Final exam -- Open book section

8 May, 1996

The entire exam is to be turned in at 2:45 PM. Work the closed book section first and turn it in before you consult your books and notes to work on the open book section.

Problem 1. (5 points)

Translate the following expression into RPN:

4 * (x + c * d)

Problem 2. (10 points)

Show how the Booth hardware algorithm, as illustrated in Figure 10-8 (p. 345) and Table 10-3 (p. 346) of the textbook, can be used to multiply the two six-bit numbers 110110 and 010011.

Problem 3. (5 points)

Name two ways in which the Pentium Pro implementation is a RISCy approach to computer architecture.

Problem 4. (10 points)

The on-chip cache of the PowerPC 601 is accessed with 32-bit addresses. The Technical Summary of PowerPC 601 states that:

The PowerPC 601 microprocessor contains a 32 Kbyte, eight-way set associative, unified (instruction and data) cache. The cache line size is 64 bytes, ....

The PowerPC indexes memory in units of bytes. Show how the PowerPC 601 addresses are divided into tag, block, and word fields as was done in the textbook. In this case, a "word" is a byte.

Now make a high-level drawing of the 601's cache organization.

How many bits are needed to store all the tags?

By the way, the PowerPC cache really returns data in 32-byte sectors. The cache line holds two of these sectors. However, you may ignore reality in your answer.

Problem 5. (5 points)

Today programmers are taught to use lots of small procedures in writing "nice" code. As a designer of high-performance hardware caches, how do you feel about this advice?

Problem 6. (5 points)

In some processors with memory-mapped I/O, the addresses generated by the processor differ in size from the addresses used on popular I/O buses. One such mismatch would be a PowerPC 620 microprocessor, with a 40-bit address space, and a PCI bus, with a 32-bit address space. Discuss (or speculate upon) potential problems and solutions for these mismatched machines. Here are a few things that you might worry about: bridges (I/O interface units), control and status registers, DMA.

Problem 7. (10 points)

Reconsider (from the last midterm) the following loop for storing the squares of the integers from 1 to 10,000 into an array SQ.

               DO 100 N = 1, 10000
          100  SQ(N) = N*N

Suppose this code is rewritten as follows:

               SQ(1) = 1
               DO 100 N = 2, 10000
          100  SQ(N) = SQ(N-1) + N + N - 1

Although both code sequences set SQ correctly, there could be significant differences in their performance. Which would you expect to be faster on a superpipelined computer? What sort of additional information would you need to know about the computer to give a good answer to this question?
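The claim that both loops produce the same array rests on the identity N*N = (N-1)*(N-1) + N + N - 1. As a quick check, here is a minimal Python sketch of the two loops. The function names squares_direct and squares_recurrence are hypothetical, chosen only for this illustration, and Python integers stand in for the Fortran INTEGER type. The sketch confirms only that the results agree; it says nothing about relative performance on any particular machine.

```python
def squares_direct(limit):
    # First loop: SQ(N) = N*N for N = 1..limit
    return [n * n for n in range(1, limit + 1)]


def squares_recurrence(limit):
    # Second loop: SQ(1) = 1, then SQ(N) = SQ(N-1) + N + N - 1.
    # Each iteration depends on the value stored by the previous one.
    # Assumes limit >= 1, as in the exam's loop bounds.
    sq = [1]
    for n in range(2, limit + 1):
        sq.append(sq[-1] + n + n - 1)
    return sq


if __name__ == "__main__":
    # Compare the two versions over the same range used in the problem.
    print(squares_direct(10000) == squares_recurrence(10000))
```

Note that the second version replaces the multiplication with additions but makes each element depend on the one before it, which is the trade-off the question asks you to weigh.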
