CSCI 431 Lecture Notes - Introduction to CSCI 431
What is a Programming Language?
Features that a programming language should have:
- universal - it can compute anything that can be computed (too
strict?)
- implementable - it can be implemented (obvious)
- efficient - it should be possible to efficiently encode algorithms
(a factor of 2-3 times slower matters a great deal!)
- syntax (form/structure) and semantics (meaning)
- Over the past few decades, thousands of programming languages have
been designed, but programming language design is by no means a dead
area. In only the past few years, a number of new languages have been
developed and have become prominent, including:
- Perl
- HTML
- Java
- AMPL
- Active VRML
Examples
Each of the new languages listed above meets some specialized goal.
For instance, AMPL is designed for expressing mathematics; HTML is a
mark-up language for hypertext documents; and Active VRML, which was
derived from the CAML family of languages, has been designed by
Microsoft to enable transmission of active virtual reality scenarios
across networks.
A few different forms of the expression x + 1:
x + 1           C, Pascal, Ada, Algol
$x + 1          Perl
x 1 add         Forth, Postscript
(+ x 1)         Lisp, Scheme
add $1 $2 $3    Assembly language
What is this Subject About?
- the computer does exactly what it is told to do
- programming is the art of precisely describing or
commanding the computer to do something
- it is very difficult to ``describe'' or ``command'' something
precisely in English
- programming languages are languages that the computer knows and are
restricted so as to allow concise and exact specifications of tasks
- there are currently over 3000 computer languages; we cannot study
each one, so we must study what they share in common: their principles
- once the principles are understood, it becomes simpler to understand
new languages
Why do we need to understand programming languages?
- Technical jobs in computer science will inevitably involve
programming, which requires us to understand the languages we use.
- We might be called upon to choose a programming language for some
project. Making such a selection involves several issues, including
technological, sociological (such as training programmers) and
economic (such as re-use of existing code and programming
environments) considerations. Of these, we will focus mostly on
technological considerations.
- Finally, we might be in a position to build a new language. To be
successful, we must understand past efforts, current needs and key
technological ideas.
What does it mean to understand a programming language?
Let us illustrate the problem via an example. Consider the
following statement:
set x[i] to x[i] + 1
Example (continued)
- This is clearly intended to denote the increment of an array
element. How would we translate this statement to a variety of
different languages, and what would it mean?
- In C (circa 1970), we would write this as
x[i] = x[i] + 1;
- This performs a hardware lookup for the address of x and adds i to
it. The addition is a hardware operation, so it is dependent upon the
hardware in question. This resulting address is then referenced (if
it's legal - which it might not be), 1 is added to the bit-string
stored there (again, as a hardware addition, which can overflow), and
the result is stored back to that location. However, no attempt has
been made to determine that x is even a vector and that x[i] is a
number.
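The reading above can be sketched in C itself. This is a minimal illustration, not part of the original notes: the names increment_element and wrapping_increment are hypothetical, and unsigned arithmetic is used for the overflow case because signed overflow is undefined behavior in C.

```c
#include <limits.h>

/* x[i] = x[i] + 1 written out as address arithmetic: C defines
   x[i] as *(x + i) and does not check that i is within bounds
   or that x really points at an array. */
int increment_element(int *x, int i)
{
    *(x + i) = *(x + i) + 1;
    return x[i];
}

/* Unsigned arithmetic wraps around modulo UINT_MAX + 1, so the
   addition can overflow without any error being reported. */
unsigned int wrapping_increment(unsigned int v)
{
    return v + 1u;
}
```

Calling increment_element with an index outside the array compiles without complaint, which is exactly the lack of checking described above.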
Example (continued)
- Finally, in Java (circa 1991), one might write
x[i] = x[i] + 1;
- which looks identical to the C code. However, the actions performed
are those performed by the Scheme code, with one major difference: the
arithmetic is not as abstract. It is defined to be done as if the
machine were a 32-bit machine, which means we can always determine the
result of an operation, no matter which machine we execute the program
on, but we cannot have our numbers grow arbitrarily large.
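To make the "as if the machine were a 32-bit machine" rule concrete, here is a rough C sketch of 32-bit wrap-around addition of the kind Java defines for its int type. The function name java_int_add is an assumption made for illustration; the code only models the addition, not Java's array bounds checks.

```c
#include <stdint.h>

/* Adds two 32-bit values with two's-complement wrap-around,
   giving the same result on any machine. */
int32_t java_int_add(int32_t a, int32_t b)
{
    /* Unsigned addition is defined to wrap modulo 2^32. */
    uint32_t u = (uint32_t)a + (uint32_t)b;

    /* Map the bit pattern back to a signed value without relying
       on implementation-defined conversions. */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

Adding 1 to the largest 32-bit value yields the smallest one, so the result is always determined but numbers cannot grow arbitrarily large.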
What do we need to know to program in a language?
- There are three crucial components to any language. The
syntax of the language is a way of specifying what is legal in
the phrase structure of the language; knowing the syntax is analogous
to knowing how to spell and form sentences in a natural language like
English. However, this doesn't tell us anything about what the
sentences mean.
- The syntax of a language can be expressed in terms of a grammar,
written in a notation such as BNF.
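As a small illustration of how a grammar specifies what is legal, the sketch below checks strings against a toy BNF grammar. The grammar and the function names (parse_term, parse_expr, is_legal_expr) are invented for this example and do not come from the course text.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy grammar, in BNF:
     <expr> ::= <term> | <term> "+" <expr>
     <term> ::= <digit> | "(" <expr> ")"
   Each function returns the position just past what it matched,
   or NULL if the input does not match. */
static const char *parse_expr(const char *s);

static const char *parse_term(const char *s)
{
    if (isdigit((unsigned char)*s))
        return s + 1;
    if (*s == '(') {
        s = parse_expr(s + 1);
        if (s != NULL && *s == ')')
            return s + 1;
    }
    return NULL;
}

static const char *parse_expr(const char *s)
{
    s = parse_term(s);
    if (s != NULL && *s == '+')
        return parse_expr(s + 1);
    return s;
}

/* Returns 1 if the whole string is a legal <expr>, 0 otherwise. */
int is_legal_expr(const char *s)
{
    const char *end = parse_expr(s);
    return end != NULL && *end == '\0';
}
```

The checker accepts "(1+2)+3" and rejects "1+", but it says nothing about what a legal string means; that is the job of the semantics.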
What we need to know (continued)
- The second component is the meaning, or semantics, of a
program in that language. Ultimately, without a semantics, a
programming language is just a collection of meaningless phrases;
hence, the semantics is the crucial part of a language.
- There are three ways of expressing the semantics of a programming
language mentioned in your book.
- Denotational Semantics tells what is computed by giving a
mathematical object (typically a function) which is the meaning
of the program. Denotational semantics are used in comparative
studies of programming languages.
- Axiomatic Semantics defines the meaning of the program
implicitly. It makes assertions about relationships that hold at
each point in the execution of the program. Axioms define the
properties of the control structures and state the properties
that may be inferred. A property about a program is deduced by
using the axioms. Each program has a pre-condition which
describes the initial conditions required by the program prior
to execution and a post-condition which describes, upon
termination of the program, the desired program property.
- Operational semantics tells how a computation is performed
by defining how to simulate the execution of the program.
Operational semantics may describe the syntactic transformations
which mimic the execution of the program on an abstract machine
or define a translation of the program into recursive
functions. Operational semantics are used when learning a
programming language and by compiler writers.
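The pre-condition and post-condition of axiomatic semantics can be approximated in C with run-time assertions. This is only a sketch: axiomatic semantics deduces such properties from axioms, whereas assert merely checks them on one execution. The function divide and its conditions are chosen for illustration.

```c
#include <assert.h>

/* Integer division by repeated subtraction.
   Pre-condition:  n >= 0 and d > 0
   Post-condition: n == q * d + r and 0 <= r < d */
void divide(int n, int d, int *q, int *r)
{
    assert(n >= 0 && d > 0);                 /* pre-condition */

    *q = 0;
    *r = n;
    while (*r >= d) {
        /* Loop invariant: n == *q * d + *r and *r >= 0 */
        *r = *r - d;
        *q = *q + 1;
    }

    assert(n == *q * d + *r);                /* post-condition */
    assert(0 <= *r && *r < d);
}
```

The comment inside the loop is the kind of assertion about "each point in the execution" that an axiomatic proof would use to get from the pre-condition to the post-condition.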
What we need to know (continued)
- Finally, as with natural languages, every programming language has
certain idioms that a programmer needs to know to use the
language effectively. This is sometimes referred to as the
pragmatics of the language. Idioms are usually acquired
through practice and experience, though research over the past few
decades has led to a better understanding of these issues.
Computer Language Levels
- Low-level - close to the machine instruction set, (e.g. machine
language, assembly language)
- High-level - generally English-like in nature, these make it
easier for humans to program, (e.g. C++, Pascal)
- Very high-level - give a general idea of what the computer should do
and let it do it (e.g. Lisp, Miranda)
Implementation Methods
In theory it is possible to construct a hardware computer to
directly execute programs written in any particular programming
language. But practical considerations favor computers with low-level
machine languages, on the basis of speed, flexibility and cost.
The solution:
- compiled languages
- interpreted languages
- hybrid systems
Translation (compilation)
Translate from the high-level language to the host computer machine
language.
- compiler
- assembler
- linker or loader
- preprocessor
Example of translation
C source code:
void initialize(int U[], int size, int init)
{
int j;
for (j=0; j < size; ++j)
U[j]=init;
}
Example of translation (continued)
corresponding assembly language code:
# 1 void initialize(int U[], int size, int init)
; $16 holds U
; $17 holds size
; $18 holds init
initialize:
sextl $17, $17 ; sign-extend $17 to 64 bits
sextl $18, $18 ; sign-extend $18 to 64 bits
# 3 int j;
# 4
# 5 for (j=0; j < size; ++j)
ble $17, L$5 ; if $17 leq 0, go to L$5
clr $1 ; $1 holds j, $1 = 0
L$6:
addl $1, 1, $1 ; add one to $1
# 6 U[j]=init;
stl $18, ($16) ; store $18 in the location that $16 holds
cmplt $1, $17, $3 ; $3 = $1 lt $17
lda $16, 4($16) ; increment $16 to point to next element
bne $3, L$6 ; branch if j lt size
L$5:
ret ($26) ; $26 holds return address
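Read back into C, the assembly above does roughly the following. This is a hand-written approximation for comparison, assuming the comments in the listing are accurate; the name initialize_lowered is hypothetical and the sign-extension steps are omitted.

```c
/* Mirrors the structure of the generated code: a guard on size,
   a loop that tests at the bottom, and a pointer that is stepped
   through the array instead of indexing U[j]. */
void initialize_lowered(int *U, int size, int init)
{
    int j;

    if (size <= 0)          /* ble $17, L$5 */
        return;
    j = 0;                  /* clr $1 */
    do {
        j = j + 1;          /* addl $1, 1, $1 */
        *U = init;          /* stl $18, ($16) */
        U = U + 1;          /* lda $16, 4($16): advance by one int */
    } while (j < size);     /* cmplt / bne */
}
```

The compiler has replaced the index computation U[j] with a moving pointer and turned the for loop into a guarded do-while, which saves a test and a multiplication per iteration.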
Interpretation (software simulation)
- Simulate a computer whose machine language is the high-level
language
- Done by constructing a set of programs in the host computer
machine language that represent the algorithms necessary for execution
of programs in the high-level language
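A very small interpreter can be sketched as follows. It is an illustrative toy, not the design of any real system: the "high-level language" here consists only of numbers, addition and multiplication, and the names expr, expr_kind and eval are made up for the example.

```c
#include <stddef.h>

enum expr_kind { EXPR_NUM, EXPR_ADD, EXPR_MUL };

/* A program in the toy language, represented as a tree. */
struct expr {
    enum expr_kind kind;
    int value;                  /* used when kind == EXPR_NUM */
    const struct expr *left;    /* operands of EXPR_ADD / EXPR_MUL */
    const struct expr *right;
};

/* The interpreter: host-language code that carries out each
   construct of the toy language directly, without translating it. */
int eval(const struct expr *e)
{
    switch (e->kind) {
    case EXPR_NUM: return e->value;
    case EXPR_ADD: return eval(e->left) + eval(e->right);
    case EXPR_MUL: return eval(e->left) * eval(e->right);
    }
    return 0;   /* not reached for well-formed trees */
}
```

Every time the program runs, eval re-examines the structure of the source, which is the main reason pure interpretation tends to be slower than compiled code.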
Hybrid Systems
- translate from a high-level language to some intermediate language
designed to allow easy interpretation
- faster than pure interpretation and provides portability
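The interpreting half of a hybrid system might look like the sketch below: a loop that executes a simple stack-based intermediate code. The opcodes and the function run are hypothetical, and the intermediate code is written by hand here to stand in for the output of a translator.

```c
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Executes intermediate code and returns the value on top of the
   stack when OP_HALT is reached. Assumes well-formed code that
   never needs more than 16 stack slots. */
int run(const int *code)
{
    int stack[16];
    int sp = 0;     /* index of the next free stack slot */
    int pc = 0;     /* index of the next instruction */

    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:               /* operand follows the opcode */
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] = stack[sp - 1] + stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] = stack[sp - 1] * stack[sp];
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return 0;               /* unknown opcode */
        }
    }
}
```

For example, the expression x + 1 with x equal to 4 would be translated once into OP_PUSH 4, OP_PUSH 1, OP_ADD, OP_HALT. The same intermediate code can then run on any machine that has this loop, which is where the portability comes from.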
Computational Paradigms
The focus of this course is general purpose, high-level languages.
Such a language must be machine-readable: unambiguous, translatable by
a finite algorithm, not too complex, and describable by a context-free
grammar.
There are different paradigms of programming languages
(similar to different phyla in biology):
- Imperative Languages
  - Procedural languages
  - Object-Oriented Languages
  - Parallel processing languages
- Declarative Languages
  - Logic Programming languages
  - Functional Programming languages
  - Database Languages
Imperative Languages
- sequential execution
- variables represent memory locations
- assignment used to change value of variables
Procedural Languages
- Nested blocks
- Procedures
- Scoping rules for variables
- Recursion
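Three of these features (nested blocks, scoping rules and recursion) can be seen in a few lines of C. The function names factorial and scope_demo are chosen only for this illustration.

```c
/* Recursion: the procedure calls itself on a smaller argument. */
int factorial(int n)
{
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);
}

/* Nested blocks and scoping: the inner x hides the outer x, but
   only until the inner block ends. */
int scope_demo(void)
{
    int x = 1;              /* x of the outer block */
    int seen_inside;
    {
        int x = 2;          /* a different variable, also named x */
        seen_inside = x;
    }
    return seen_inside * 10 + x;    /* 2 * 10 + 1 */
}
```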
Object-Oriented Languages
- object is collection of memory locations and operations that can
change values at them
- opposite of functional programming - focuses on data structures
- computation is interaction and communication between objects
- each object behaves like a computer - own memory and operations
- classes defined using structured declarations
- objects are created as instances of a class
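C is not object-oriented, but the idea of an object as memory bundled with the operations on it can be imitated with a struct that holds function pointers. This is a loose sketch under that assumption; the counter "class" and all of its names are invented for the example.

```c
/* Plays the role of a class: a structured declaration describing
   each object's memory and the operations that may change it. */
struct counter {
    int count;
    void (*increment)(struct counter *self);
    int  (*get)(const struct counter *self);
};

static void counter_increment(struct counter *self)
{
    self->count++;
}

static int counter_get(const struct counter *self)
{
    return self->count;
}

/* Plays the role of a constructor: creates a new instance. */
struct counter counter_new(void)
{
    struct counter c = { 0, counter_increment, counter_get };
    return c;
}
```

Two counters created with counter_new each keep their own count, so sending increment to one does not affect the other.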
Parallel Processing Languages
- shared memory
- message passing
Functional Languages
- function theory
- parameter passing
- returned values
- no notion of mutable data structures (values are not updated in place)
- no looping (uses recursion)
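The functional style can be imitated in C, although C is not a functional language. The sketch below uses only parameter passing, returned values and recursion, with no loop and no variable that is reassigned; the names sum, twice and add_one are illustrative.

```c
/* Sum of a[0..n-1], defined by recursion instead of a loop. */
int sum(const int *a, int n)
{
    return n == 0 ? 0 : a[0] + sum(a + 1, n - 1);
}

/* A function that takes another function as a parameter and
   applies it twice: f(f(x)). */
int twice(int (*f)(int), int x)
{
    return f(f(x));
}

int add_one(int x)
{
    return x + 1;
}
```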
Logic Programming Languages
- based on symbolic logic
- no loops or other control structures
- describe what is true about desired result
Language Criteria
How to evaluate the features of the various languages
- readability
- orthogonality
- well-defined descriptions (syntax considerations)
- provability (semantics considerations)
- reliability
- efficiency (cost)
- expressibility (writability)
- extensibility