RUN-TIME STORAGE MANAGEMENT

 

                     The semantics of procedures in a language determines how names are bound to storage during allocation. Information needed during an execution of a procedure is kept in a block of storage called an activation record; storage for names local to the procedure also appears in the activation record.

 

                     An activation record for a procedure has fields to hold parameters, results, machine-status information, local data, temporaries and the like. Since run-time allocation and de-allocation of activation records occur as part of the procedure call and return sequences, we focus on the following three-address statements:

1.      call

2.      return

3.      halt

4.      action, a placeholder for other statements

 

                     For example, the three-address code for procedures c and p in Fig. 4 contains just these kinds of statements. The size and layout of activation records are communicated to the code generator via the information about names that is in the symbol table. For clarity, we show the layout in Fig. 4 rather than the form of the symbol-table entries.

We assume that run-time memory is divided into areas for code, static data, and a stack.

                   

                   

Three-address code:

/* code for c */
     action1
     call p
     action2
     halt

/* code for p */
     action3
     return

Activation record for c (64 bytes):

      0:   return address
      8:   arr
     56:   i
     60:   j

Activation record for p (88 bytes):

      0:   return address
      4:   buf
     84:   n

                                                 Fig. 4. Input to a code generator

 

STATIC ALLOCATION

 

                     Consider the code needed to implement static allocation. A call statement in the intermediate code is implemented by a sequence of two target-machine instructions. A MOV instruction saves the return address, and a GOTO transfers control to the target code for the called procedure:

 

MOV     #here+20, callee.static_area

GOTO    callee.code_area

 

                     The attributes callee.static_area and callee.code_area are constants referring to the address of the activation record and the first instruction for the called procedure, respectively. The source #here+20 in the MOV instruction is the literal return address; it is the address of the instruction following the GOTO instruction.

                     The code for a procedure ends with a return to the calling procedure, except that the first procedure has no caller, so its final instruction is HALT, which presumably returns control to the operating system. A return from procedure callee is implemented by

 

  GOTO    *callee.static_area

which transfers control to the address saved at the beginning of the activation record.

 

Example 1: The code in Fig. 5 is constructed from the procedures c and p in Fig. 4. We use the pseudo-instruction ACTION to implement the statement action, which represents three-address code that is not relevant for this discussion. We arbitrarily start the code for these procedures at addresses 100 and 200, respectively, and assume that each ACTION instruction takes 20 bytes. The activation records for the procedures are statically allocated starting at locations 300 and 364, respectively.

 

                         

                                   /* code for c */
             100:    ACTION1
             120:    MOV   #140, 364      /* save return address 140 */
             132:    GOTO  200            /* call p */
             140:    ACTION2
             160:    HALT
                     ……
                                   /* code for p */
             200:    ACTION3
             220:    GOTO  *364           /* return to address saved in location 364 */
                     ……
                                   /* 300-363 hold activation record for c */
             300:                         /* return address */
             304:                         /* local data for c */
                     ……
                                   /* 364-451 hold activation record for p */
             364:                         /* return address */
             368:                         /* local data for p */

                                 Fig. 5. Target code for input in Fig. 4.

 

The instructions starting at address 100 implement the statements

 

  action1 ; call p; action2; halt

 

of the first procedure c. Execution therefore starts with the instruction ACTION1 at address 100. The MOV instruction at address 120 saves the return address 140 in the machine-status field, which is the first word in the activation record of p. The GOTO instruction at address 132 transfers control to the first instruction in the target code of the called procedure.

 Since 140 was saved at address 364 by the call sequence above, *364 represents 140 when the GOTO statement at address 220 is executed. Control therefore returns to address 140 and execution of procedure c resumes.
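The control flow traced above can be checked with a small simulation. The following Python sketch is only an illustration: the tuple encoding of instructions and the run helper are assumptions made here, not part of the target machine. The 12-byte size of the MOV instruction is read off the addresses in Fig. 5 (120 to 132).

```python
# Toy interpreter for the target code of Fig. 5 (static allocation).
# The instruction encoding below is a hypothetical one chosen for this sketch.

def run(code, start):
    memory = {}      # data memory: address -> stored value
    trace = []       # addresses of the instructions executed, in order
    pc = start
    while True:
        op, *args = code[pc]
        trace.append(pc)
        if op == "ACTION":            # placeholder statement, 20 bytes long
            pc += 20
        elif op == "MOV":             # MOV #literal, address
            literal, dest, size = args
            memory[dest] = literal
            pc += size
        elif op == "GOTO":            # direct jump
            pc = args[0]
        elif op == "GOTO*":           # indirect jump through a memory word
            pc = memory[args[0]]
        elif op == "HALT":
            return trace, memory

FIG5 = {
    100: ("ACTION",),
    120: ("MOV", 140, 364, 12),   # save return address 140 in p's record
    132: ("GOTO", 200),           # call p
    140: ("ACTION",),
    160: ("HALT",),
    200: ("ACTION",),
    220: ("GOTO*", 364),          # return to the address saved at 364
}

trace, memory = run(FIG5, 100)
print(trace)          # order in which instruction addresses were executed
print(memory[364])    # return address stored in p's activation record
```

In this sketch the trace visits 100, 120, 132, 200, 220, 140, 160, so control passes through p and resumes in c at address 140.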

 

 

STACK ALLOCATION

                     Static allocation can become stack allocation by using relative addresses for storage in activation records. The position of the record for an activation of a procedure is not known until run time. In stack allocation, this position is usually stored in a register, so words in the activation record can be accessed as offsets from the value in this register. The indexed address mode of our target machine is convenient for this purpose.

                     Relative addresses in an activation record can be taken as offsets from any known position in the activation record. For convenience, we shall use positive offsets by maintaining in a register SP a pointer to the beginning of the activation record on top of the stack. When a procedure call occurs, the calling procedure increments SP and transfers control to the called procedure. After control returns to the caller, it decrements SP, thereby de-allocating the activation record of  the called procedure.

                     The code for the first procedure initializes the stack by setting SP to the start of the stack area in memory.

 

MOV   #stackstart, SP          /*initialize the stack*/

code for the first procedure

HALT                                    /*terminate execution*/

 

                     A procedure call sequence increments SP, saves the return address, and transfers control to the called procedure:

 

ADD   #caller.recordsize, SP

MOV   #here+16, *SP                  /* save return address*/

GOTO   callee.code_area  

 

                     The attribute caller.recordsize represents the size of an activation record, so the ADD instruction leaves SP pointing to the beginning of the next activation record. The source #here+16 in the MOV instruction is the address of the instruction following the GOTO; it is saved in the address pointed to by SP.

The return sequence consists of two parts. The called procedure transfers control to the return address using

 

GOTO   *0(SP)            /*return to caller*/

 

The reason for using *0(SP) in the GOTO instruction is that we need two levels of indirection: 0(SP) is the address of the first word in the activation record and *0(SP) is the return address saved there.

The second part of the return sequence is in the caller, which decrements SP, thereby restoring SP to its previous value. That is, after the subtraction SP points to the beginning of the activation record of the caller:

 

SUB #caller.recordsize, SP
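As a rough illustration of how a code generator might emit these sequences, the Python sketch below builds the caller-side and callee-side instructions as strings. The function names emit_call and emit_return are hypothetical, and the sketch assumes that here denotes the address of the MOV instruction and that MOV and GOTO together occupy 16 bytes, as implied by the literal #here+16.

```python
# Sketch of emitting the stack-allocation call and return sequences.
# emit_call and emit_return are illustrative names, not part of any real tool.

def emit_call(here, caller_recordsize, callee_code_area):
    """Caller-side instructions for one call.

    `here` is assumed to be the address of the MOV instruction, so
    here + 16 is the address of the instruction following the GOTO.
    """
    return [
        f"ADD #{caller_recordsize}, SP",   # SP -> start of callee's record
        f"MOV #{here + 16}, *SP",          # save return address in first word
        f"GOTO {callee_code_area}",        # transfer control to the callee
        f"SUB #{caller_recordsize}, SP",   # executed after return: restore SP
    ]

def emit_return():
    """Callee-side part of the return sequence."""
    return ["GOTO *0(SP)"]

call_from_s = emit_call(136, "ssize", 300)
for line in call_from_s + emit_return():
    print(line)
```

With here = 136, a caller record size of ssize, and a callee at address 300, the sketch produces the same four caller-side instructions that appear at addresses 128 to 152 in the listing of Example 2.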

 

Example 2

The program in Fig. 6 is a condensation of the three-address code for the Pascal program for reading and sorting integers. Procedure q is recursive, so more than one activation of q can be alive at the same time.

        

 

/* code for s */
     action1
     call q
     action2
     halt

/* code for p */
     action3
     return

/* code for q */
     action4
     call p
     action5
     call q
     action6
     call q
     return

                   Fig. 6. Three-address code to illustrate stack allocation

           

                    

                     Suppose that the sizes of the activation records for procedures s, p, and q have been determined at compile time to be ssize, psize, and qsize, respectively. The first word in each activation record will hold a return address. We arbitrarily assume that the code for these procedures starts at addresses 100, 200, and 300, respectively, and that the stack starts at 600. The target code for the program in Fig. 6 is as follows:

 

                                   /* code for s */
          100:  MOV   #600, SP       /* initialize the stack */
          108:  ACTION1
          128:  ADD   #ssize, SP     /* call sequence begins */
          136:  MOV   #152, *SP      /* push return address */
          144:  GOTO  300            /* call q */
          152:  SUB   #ssize, SP     /* restore SP */
          160:  ACTION2
          180:  HALT
                ………
                                   /* code for p */
          200:  ACTION3
          220:  GOTO  *0(SP)         /* return */
                ………
                                   /* code for q */
          300:  ACTION4              /* conditional jump to 456 */
          320:  ADD   #qsize, SP
          328:  MOV   #344, *SP      /* push return address */
          336:  GOTO  200            /* call p */
          344:  SUB   #qsize, SP
          352:  ACTION5
          372:  ADD   #qsize, SP
          380:  MOV   #396, *SP      /* push return address */
          388:  GOTO  300            /* call q */
          396:  SUB   #qsize, SP
          404:  ACTION6
          424:  ADD   #qsize, SP
          432:  MOV   #448, *SP      /* push return address */
          440:  GOTO  300            /* call q */
          448:  SUB   #qsize, SP
          456:  GOTO  *0(SP)         /* return */
                ………
          600:                       /* stack starts here */

 

     

                    

                     We assume that ACTION4 contains a conditional jump to the address 456 of the return sequence from q; otherwise, the recursive procedure q is condemned to call itself forever. In an example below, we consider an execution of the program in which the first call of q does not return immediately, but all subsequent calls do.

                     If ssize, psize, and qsize are 20, 40, and 60, respectively, then SP is initialized to 600, the starting address of the stack, by the first instruction at address 100. SP holds 620 just before control transfers from s to q, because ssize is 20. Subsequently, when q calls p, the instruction at address 320 increments SP to 680, where the activation record for p begins; SP reverts to 620 after control returns to q. If the next two recursive calls of q return immediately, the maximum value of SP during this execution is 680. However, the last stack location used is 739, since the activation record for q starting at location 680 extends for 60 bytes.
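The SP values quoted above can be reproduced with a small interpreter for the listing. The Python sketch below is illustrative only: the opcode names (SET_SP, ADD_SP, SUB_SP, SAVE_RET, CALL, RET) are made up for this sketch, CALL stands for the GOTO instructions that implement calls, and the rule that ACTION4 jumps to 456 on every activation of q except the first encodes the execution assumed in the text.

```python
# Toy interpreter for the Example 2 listing (stack allocation).
# Opcode names and the ACTION4 rule are assumptions made for this sketch.
SSIZE, PSIZE, QSIZE = 20, 40, 60
RECORD_SIZE = {200: PSIZE, 300: QSIZE}   # callee entry address -> record size

CODE = {
    100: ("SET_SP", 600),   108: ("ACTION", 1),
    128: ("ADD_SP", SSIZE), 136: ("SAVE_RET", 152), 144: ("CALL", 300),
    152: ("SUB_SP", SSIZE), 160: ("ACTION", 2),     180: ("HALT",),
    200: ("ACTION", 3),     220: ("RET",),
    300: ("ACTION", 4),
    320: ("ADD_SP", QSIZE), 328: ("SAVE_RET", 344), 336: ("CALL", 200),
    344: ("SUB_SP", QSIZE), 352: ("ACTION", 5),
    372: ("ADD_SP", QSIZE), 380: ("SAVE_RET", 396), 388: ("CALL", 300),
    396: ("SUB_SP", QSIZE), 404: ("ACTION", 6),
    424: ("ADD_SP", QSIZE), 432: ("SAVE_RET", 448), 440: ("CALL", 300),
    448: ("SUB_SP", QSIZE), 456: ("RET",),
}

def run(code, start=100):
    memory, sp, pc = {}, 0, start
    sp_values = []       # every value SP takes, in order
    last_used = 0        # highest stack address covered by any record
    q_activations = 0
    while True:
        op, *args = code[pc]
        if op == "HALT":
            return sp_values, last_used
        if op == "ACTION":
            pc += 20                       # each ACTION takes 20 bytes
            if args[0] == 4:
                q_activations += 1
                if q_activations > 1:      # later calls of q return at once
                    pc = 456
        elif op == "CALL":
            # The callee's record starts at SP and spans its record size.
            last_used = max(last_used, sp + RECORD_SIZE[args[0]] - 1)
            pc = args[0]
        elif op == "RET":
            pc = memory[sp]                # GOTO *0(SP)
        else:
            if op == "SET_SP":
                sp = args[0]
                last_used = sp + SSIZE - 1
            elif op == "ADD_SP":
                sp += args[0]
            elif op == "SUB_SP":
                sp -= args[0]
            elif op == "SAVE_RET":
                memory[sp] = args[0]       # MOV #addr, *SP
            if op != "SAVE_RET":
                sp_values.append(sp)
            pc += 8                        # these instructions take 8 bytes

sp_values, last_used = run(CODE)
print(sp_values)
print(max(sp_values), last_used)
```

Under these assumptions SP moves through 600, 620, 680, 620, 680, 620, 680, 620, 600; its maximum is 680 and the highest location covered by an activation record is 739, matching the figures in the text.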

 

 

 

 RUN-TIME ADDRESSES FOR NAMES

 

                     The storage allocation strategy and the layout of local data in an activation record for a procedure determine how the storage for names is accessed.

If we assume that a name in a three-address statement is really a pointer to a symbol-table entry for the name, the compiler becomes more portable, since the front end need not be changed even if the compiler is moved to a different machine where a different run-time organization is needed. On the other hand, generating the specific sequence of access steps while generating intermediate code can be of significant advantage in an optimizing compiler, since it lets the optimizer take advantage of details it would not even see in the simple three-address statement.

                     In either case, names must eventually be replaced by code to access storage locations. We thus consider some elaborations of the simple three-address statement x := 0. After the declarations in a procedure are processed, suppose the symbol-table entry for x contains a relative address 12 for x. First consider the case in which x is in a statically allocated area beginning at address static. Then the actual run-time address for x is static+12. Although the compiler can eventually determine the value of static+12 at compile time, the position of the static area may not be known when intermediate code to access the name is generated. In that case, it makes sense to generate three-address code to “compute” static+12, with the understanding that this computation will be carried out during the code-generation phase, or possibly by the loader, before the program runs. The assignment x := 0 then translates into

 

     static[12] := 0

If the static area starts at address 100, the target code for this statement is

 

    MOV  #0, 112

                     On the other hand, suppose our language is one like Pascal and that a display is used to access non-local names. Suppose also that the display is kept in registers, and that x is local to an active procedure whose display pointer is in register R3. Then we may translate the copy x := 0 into the three-address statements

 

     t1 := 12+R3

     *t1 := 0

in which t1 contains the address of x. This sequence can be implemented by the single machine instruction

 

       MOV  #0, 12(R3)

The value in R3 cannot be determined at compile time.
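The two translations of x := 0 can be summarized in a short sketch. The Python function below is a hypothetical illustration: the dictionary standing in for a symbol-table entry and the names translate_assign_zero, storage, offset, and register are assumptions made here, not the form of a real symbol table.

```python
# Sketch: choosing target code for x := 0 from a symbol-table entry.
# The entry format and function name are hypothetical.

def translate_assign_zero(entry, static_base=None):
    """Return a target instruction that stores 0 into the name's location."""
    if entry["storage"] == "static":
        # Static area: the absolute address static+offset is a constant
        # once the start of the static area is known.
        return f"MOV #0, {static_base + entry['offset']}"
    # Display in registers: the address is offset(register), resolved at
    # run time through the indexed address mode.
    return f"MOV #0, {entry['offset']}({entry['register']})"

static_x = {"storage": "static", "offset": 12}
display_x = {"storage": "display", "offset": 12, "register": "R3"}

print(translate_assign_zero(static_x, static_base=100))
print(translate_assign_zero(display_x))
```

The first call folds the addition static+12 into the constant 112, while the second leaves the addition to the hardware because the contents of R3 are known only at run time.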

 

 

