Lecture #09 (12 February 2002)

Programming Languages and Program Translation


Overall Reading
Brookshear: pp. 226-229, 243-246, 255-262
Decker/Hirshfield: Mod. 6.1, pp. 216-218, Mod. 6.4

Outline:

  • What's wrong with machine code or assembly language?
  • Designing a high-level programming language
  • Program Translation
  • Lexical Analysis (Scanning)
  • Parsing
  • Code Generation
  • Past and Present Programming Languages

  • What's wrong with machine code or assembly language?

    Machine language is expressive enough to write any of the programs you have ever seen run on a computer. In fact, every program you have ever run was actually running as machine code; that is the only language which the processor understands!

    However, although we are able to write programs directly in a given machine language, doing so is extremely tedious. Even programs that accomplish "simple" tasks take a long time to write. It seems almost impossible to imagine sitting down to write a more complicated piece of software, such as a word processor, a database, or a computer animation program.

  • Programs need hundreds, thousands, or millions of individual instructions.

  • You cannot design a single program that works on various CPU types, because those CPUs have DIFFERENT instruction sets (e.g. Pentium, PowerPC, Sparc, etc.). You would have to write a separate version of every program for every machine type.

  • Tough to find mistakes (can't see the forest for the trees).

  • Tougher to fix them.
    Example: what if you realize you need to insert one extra instruction into the middle of the program?
      • You must slide all later instructions one spot farther into memory.
      • This also means that the addresses for some 'jumps' are no longer accurate (though many instruction sets allow for a jump which is expressed in relative terms rather than absolute terms).
  • Reusing code for related purposes is not easy.
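
    The insertion problem can be sketched in a few lines of Python. This is only an illustration: the tiny four-instruction program, the two-cell instruction width, and the helper names layout and target_of_jump are assumptions made for the example, not part of any real machine.

```python
# Each instruction is assumed to occupy 2 memory cells.
WIDTH = 2

def layout(instructions):
    """Assign each instruction the address it would occupy in memory."""
    return [(i * WIDTH, op, arg) for i, (op, arg) in enumerate(instructions)]

def target_of_jump(instructions):
    """Return the opcode stored at the address named by the first JMZ."""
    placed = layout(instructions)
    jump_target = next(arg for _, op, arg in placed if op == "JMZ")
    return next(op for addr, op, _ in placed if addr == jump_target)

# A hypothetical program whose jump uses an absolute address.
program = [
    ("LOD", "X"),
    ("JMZ", 6),      # intended target: the HLT at address 6
    ("STO", "W"),
    ("HLT", None),
]

print(target_of_jump(program))    # the jump lands on HLT, as intended

# Insert one extra instruction in the middle WITHOUT fixing the jump.
patched = program[:2] + [("SUB", "Y")] + program[2:]
print(target_of_jump(patched))    # the jump now lands on STO instead
```

    Every instruction after the insertion point moved two cells, so the unchanged jump address 6 now names a different instruction.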

  • Designing a high-level programming language

    There are two key requirements to balance when designing a high-level language:

    1. It should be easier for humans to use.
    2. It must still be understandable by a machine.
    In this section, we talk about the first of these issues. In the following section, we will address the second.

    Easier for Humans

    If a language can be designed to be more intuitive for humans, it makes writing, reading and maintaining programs easier. This should make program development as productive and efficient as possible.

    Example: Let W equal the maximum of X and Y.

      if (X<Y) then
        W = Y
      else
        W = X
    
    vs.
     0 LOD X
     2 SUB Y
     4 STO T4
     6 CPL T4
     8 JMZ 14
    10 LOD Y
    12 JMP 16
    14 LOD X 
    16 STO W
    18 HLT
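
    To check that the ten machine instructions above really compute the maximum, they can be run on a small simulator. The sketch below is written in Python and rests on assumptions about the instruction set that the listing itself does not spell out: a single accumulator, two memory cells per instruction, CPL setting the accumulator to 1 when its operand is negative (0 otherwise), and JMZ jumping when the accumulator is 0. The names run and machine_max are made up for this example.

```python
def run(program, memory):
    """Execute (address, opcode, operand) triples on a one-accumulator machine."""
    code = {addr: (op, arg) for addr, op, arg in program}
    acc, pc = 0, 0
    while True:
        op, arg = code[pc]
        pc += 2                       # each instruction occupies two cells
        if op == "LOD":
            acc = memory[arg]
        elif op == "SUB":
            acc -= memory[arg]
        elif op == "STO":
            memory[arg] = acc
        elif op == "CPL":             # assumed: 1 if operand < 0, else 0
            acc = 1 if memory[arg] < 0 else 0
        elif op == "JMZ":             # assumed: jump when accumulator is 0
            if acc == 0:
                pc = arg
        elif op == "JMP":
            pc = arg
        elif op == "HLT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")

# The listing from the lecture, transcribed as data.
MAX_PROGRAM = [
    (0, "LOD", "X"), (2, "SUB", "Y"), (4, "STO", "T4"), (6, "CPL", "T4"),
    (8, "JMZ", 14), (10, "LOD", "Y"), (12, "JMP", 16), (14, "LOD", "X"),
    (16, "STO", "W"), (18, "HLT", None),
]

def machine_max(x, y):
    """Run the listing with the given X and Y and return the final W."""
    return run(MAX_PROGRAM, {"X": x, "Y": y})["W"]

print(machine_max(3, 7), machine_max(7, 3))
```

    Under these assumptions the simulator agrees with the four-line high-level version for any pair of integers, which is exactly the point: both say the same thing, but only one of them is easy to read.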
    

    Other common Control Structures

  • while loop
  • case analysis
  • for loop

  • Figure 5.7 of Brookshear text


    Figure 5.8 of Brookshear text


    Program Translation

    The syntax of a language must be defined formally so that a program written in it can be translated (automatically) to machine language.


    Figure 5.12 of Brookshear text

  • Lexical Analysis (Scanning)
    When a translator first reads a line of a program, it must take the string of individual characters, and recognize how to group them into 'tokens' appropriately.

    For instance, it must recognize 'if' as a keyword in the language. Likewise, it must recognize '235' as a single (decimal) number, not three unrelated characters of text.
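
    A minimal scanner along these lines might look as follows in Python. It is a sketch, not how any particular translator works: the keyword set, the token category names (NUMBER, NAME, SYMBOL, KEYWORD) and the function name tokenize are all choices made for this example.

```python
import re

KEYWORDS = {"if", "then", "else"}

# One alternative per kind of token; whitespace is matched so it can be skipped.
TOKEN_PATTERN = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<SYMBOL>[<>=()+\-*/])
  | (?P<SPACE>\s+)
""", re.VERBOSE)

def tokenize(text):
    """Group the characters of text into (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, value = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SPACE":
            continue                  # whitespace separates tokens but is not one
        if kind == "NAME" and value in KEYWORDS:
            kind = "KEYWORD"          # 'if' is a keyword, not a variable name
        tokens.append((kind, value))
    return tokens

print(tokenize("if (X<235) then"))
```

    Here 'if' comes out as a single KEYWORD token and '235' as a single NUMBER token, rather than as two and three separate characters.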

  • Parsing
    Before converting to true machine code, a translator tries to determine the syntactical structure of the high-level commands. That is, having identified the individual tokens, it must now consider the possibly complex order of these tokens to understand the high-level code.

    As with human languages, programming languages are defined by grammars and syntax rules, though far more specific ones. It is even more important that such a grammar avoids ambiguity.

    Example: Set X to its absolute value.

      if (X<0) then
        X = -X
    

    Example: Let W equal the maximum of X and Y.

      if (X<Y) then
        W = Y
      else
        W = X
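
    The structure a parser recovers from statements like these two can be sketched with a small recursive routine in Python. This is an illustrative toy that assumes a very restricted grammar (an 'if' with a single '<' comparison, and assignments of the form NAME = OPERAND or NAME = - OPERAND); the tuple-based tree format and the function names are invented for the example. The input is a list of tokens, already separated.

```python
def expect(tokens, pos, wanted):
    """Check that tokens[pos] is the wanted token and step past it."""
    if pos >= len(tokens) or tokens[pos] != wanted:
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise SyntaxError(f"expected {wanted!r} but found {found!r}")
    return pos + 1

def parse_condition(tokens, pos):
    """Parse  ( OPERAND < OPERAND )  and return (tree, next_pos)."""
    pos = expect(tokens, pos, "(")
    left = tokens[pos]
    pos = expect(tokens, pos + 1, "<")
    right = tokens[pos]
    pos = expect(tokens, pos + 1, ")")
    return ("<", left, right), pos

def parse_statement(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (tree, next_pos)."""
    if tokens[pos] == "if":
        condition, pos = parse_condition(tokens, pos + 1)
        pos = expect(tokens, pos, "then")
        then_branch, pos = parse_statement(tokens, pos)
        else_branch = None
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_statement(tokens, pos + 1)
        return ("if", condition, then_branch, else_branch), pos
    # Otherwise: an assignment of the form  NAME = [-] OPERAND
    target = tokens[pos]
    pos = expect(tokens, pos + 1, "=")
    if tokens[pos] == "-":
        return ("assign", target, ("neg", tokens[pos + 1])), pos + 2
    return ("assign", target, tokens[pos]), pos + 1

tree, _ = parse_statement("if ( X < Y ) then W = Y else W = X".split())
print(tree)
```

    The result is a nested tuple with the comparison, the 'then' branch and the 'else' branch as separate subtrees; for the absolute-value example the 'else' slot is simply empty.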
    


    Figure 5.13 of Brookshear text

    Example with Ambiguity?

      if B1 then if B2 then S1 else S2
    
    What is the intended behavior?

    B1      B2      behavior
    true    true    S1
    true    false   ?
    false   true    ?
    false   false   ?
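
    One way to explore the question is to write out both possible groupings explicitly and compare them. The Python sketch below does this; the function names are made up, and the strings "S1", "S2" and "nothing" simply stand for which statement (if any) would be executed.

```python
def else_with_inner_if(b1, b2):
    """Reading A:  if B1 then (if B2 then S1 else S2)"""
    if b1:
        if b2:
            return "S1"
        else:
            return "S2"
    return "nothing"

def else_with_outer_if(b1, b2):
    """Reading B:  if B1 then (if B2 then S1) else S2"""
    if b1:
        if b2:
            return "S1"
        return "nothing"
    else:
        return "S2"

# Print one row per combination: B1, B2, reading A, reading B.
for b1 in (True, False):
    for b2 in (True, False):
        print(b1, b2, else_with_inner_if(b1, b2), else_with_outer_if(b1, b2))
```

    The two readings agree only when B1 and B2 are both true, so a grammar that permits both is genuinely ambiguous. Many languages, including C, Java and Pascal, settle the matter by rule: an 'else' belongs to the nearest unmatched 'if', which is reading A.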

  • Code Generation
    The parsing was really the difficult work. Once the translator understands the structure, it can use a set of rules to generate actual machine code.

    Let's revisit some of the common Control Structures:

  • if/then
  • if/then/else
  • for loop
  • while loop
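
    As a rough illustration of "a set of rules", the Python sketch below encodes one rule for assignment and one for if/then/else with a '<' test, then applies them to the maximum example. Several things here are assumptions rather than anything the lecture specifies: the instruction meanings (CPL gives 1 when its operand is negative, JMZ jumps on 0), the temporary cell T4, the use of symbolic labels that a second pass turns into addresses, and all function names. The fixed labels ELSE and END also mean this toy handles only a single, non-nested 'if'.

```python
def gen_assign(target, source):
    """Rule:  target = source   becomes   LOD source / STO target."""
    return [("LOD", source), ("STO", target)]

def gen_if_less(left, right, then_code, else_code, temp="T4"):
    """Rule for:  if (left < right) then ... else ...
    Jump targets are symbolic labels; resolve() turns them into addresses."""
    return (
        [("LOD", left), ("SUB", right), ("STO", temp), ("CPL", temp),
         ("JMZ", "ELSE")]             # test failed: skip to the else part
        + then_code
        + [("JMP", "END"), ("LABEL", "ELSE")]
        + else_code
        + [("LABEL", "END")]
    )

def resolve(code, width=2):
    """Second pass: replace labels with numeric addresses."""
    addresses, addr = {}, 0
    for op, arg in code:
        if op == "LABEL":
            addresses[arg] = addr     # a label names the next instruction
        else:
            addr += width
    listing, addr = [], 0
    for op, arg in code:
        if op == "LABEL":
            continue
        if op in ("JMZ", "JMP"):
            arg = addresses[arg]
        listing.append(f"{addr} {op} {arg}".rstrip())
        addr += width
    return listing

code = gen_if_less("X", "Y", gen_assign("W", "Y"), gen_assign("W", "X"))
code.append(("HLT", ""))
for line in resolve(code):
    print(line)
```

    The generated listing is one instruction longer than the hand-written version earlier in these notes, because each branch stores into W separately. Mechanical rules are simple to apply but do not always produce the shortest code.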

  • Past and Present Programming Languages


    Figure 5.2 of Brookshear text

    There are many languages, each of which has its own strengths, weaknesses and history for developing software in particular settings. To really understand many of the differences, it is necessary to try writing programs in various languages. We will not do so in this course, but for those who want to get a feel for some of the issues, you can read pp. 230-235 and Ch. 5.5-5.7 and App. D of [Br].


    comp150 Class Page
    mhg@cs.luc.edu
    Last modified: 12 February 2002