# 24 April, 2000

## I'm not making this up

Suppose a program executes

• y = x * a + b

where x is stored at address 0x345205A8 or binary 00110100010100100000010110101000.

## Look in the L1 cache

### Cache structure

• cache is 16k bytes
• there are 512 entries
• each cache entry is 32 bytes
• 5 bits are required to address 32 bytes
• 27 bits remain for the tag

| 27-bit tag | 32 bytes of data |
|---|---|
| 011101011101001000000100101 | 88 99 AA BB CC ... |
| 101101000101001000110101101 | 57 63 FF 00 CE ... |
| 001101000101001000000101101 | 00 00 00 00 FF ... |
| 110010101101001001011101101 | 44 65 61 6e 20 ... |
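
The numbers above follow from the cache size and the line size. A minimal sketch of the arithmetic, assuming 32-bit addresses (the constant names are illustrative, not from any Intel document):

```python
CACHE_BYTES = 16 * 1024   # 16k-byte cache
LINE_BYTES = 32           # bytes per cache entry
ADDRESS_BITS = 32         # assumed address width

entries = CACHE_BYTES // LINE_BYTES          # number of cache entries
offset_bits = LINE_BYTES.bit_length() - 1    # log2(32): bits to pick a byte in a line
tag_bits = ADDRESS_BITS - offset_bits        # everything else is the tag

print(entries, offset_bits, tag_bits)
```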

### Cache action

• divide address into tag (27 bits) and offset (5 bits)
• tag = 001101000101001000000101101
• offset = 01000
• look for the tag in the cache
• if found, retrieve the 32 bytes of data
• use the offset to return the appropriate 4 bytes of the 32
• if not found, look in the L2, L3, ... caches
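
The steps above can be sketched in a few lines. This is a toy model, not how the hardware is built: the cache is a Python dict from tag to a 32-byte line, and the line contents are made up for the example.

```python
OFFSET_BITS = 5  # 32-byte lines

def split(addr):
    """Split a 32-bit address into (27-bit tag, 5-bit offset)."""
    return addr >> OFFSET_BITS, addr & ((1 << OFFSET_BITS) - 1)

def lookup(cache, addr, size=4):
    """Return `size` bytes on a hit, or None on a miss."""
    tag, offset = split(addr)
    line = cache.get(tag)
    if line is None:
        return None  # miss: the caller would try L2, L3, ... and then RAM
    return line[offset:offset + size]

# Hypothetical contents: one line whose bytes are 0, 1, ..., 31.
cache = {0b001101000101001000000101101: bytes(range(32))}
hit = lookup(cache, 0x345205A8)    # offset 01000 = 8, so bytes 8..11 of the line
miss = lookup(cache, 0x00000000)   # tag not in the cache
```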

### N-way set associative

Most P6 caches are 4-way set associative.

• 512 entries
• 4 entries in each set
• therefore, 128 sets
• 128 = 2^7, so the set number covers 7 of the 27 "tag" bits
• divide address into tag (20 bits), set (7 bits), and offset (5 bits)
• tag = 00110100010100100000
• set = 0101101
• offset = 01000

| 7-bit set | Entry | 20-bit tag | 32 bytes of data |
|---|---|---|---|
| 0000000 | 1 | 01110101110100100000 | 4A 61 6E 65 ... |
| 0000000 | 2 | 01100100110100100000 | 74 20 00 00 ... |
| .... | | | |
| 0101101 | 1 | 00110100010100100000 | 53 65 62 65 ... |
| 0101101 | 2 | 00111101110100100000 | 63 63 61 20 ... |
| 0101101 | 3 | 10110100010100100011 | 47 68 69 6C ... |
| 0101101 | 4 | 10111000010100100011 | 6D 61 6E 20 ... |
| .... | | | |
| 1111111 | 1 | 11001010110100100111 | 44 65 61 6e ... |
| 1111111 | 2 | 00100010010100100111 | 20 00 00 00 ... |
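
The width of each field follows from the cache geometry: a cache with S sets needs log2(S) bits for the set number. A minimal sketch, assuming 32-bit addresses and modeling the cache as a dict from set number to a list of (tag, line) pairs (the function names and the data layout are illustrative only):

```python
def field_widths(cache_bytes, line_bytes, ways, addr_bits=32):
    """Return (tag_bits, set_bits, offset_bits) for a set-associative cache."""
    sets = cache_bytes // line_bytes // ways
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    set_bits = sets.bit_length() - 1            # log2(sets)
    return addr_bits - set_bits - offset_bits, set_bits, offset_bits

def split(addr, set_bits, offset_bits):
    """Split an address into (tag, set number, offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    set_index = (addr >> offset_bits) & ((1 << set_bits) - 1)
    tag = addr >> (offset_bits + set_bits)
    return tag, set_index, offset

def lookup(sets, addr, set_bits, offset_bits, size=4):
    """Search only the entries of one set; return bytes on a hit, else None."""
    tag, set_index, offset = split(addr, set_bits, offset_bits)
    for entry_tag, line in sets.get(set_index, []):
        if entry_tag == tag:
            return line[offset:offset + size]
    return None

# 16k bytes, 32-byte lines, 4-way: 128 sets.
tag_bits, set_bits, offset_bits = field_widths(16 * 1024, 32, 4)
tag, set_index, offset = split(0x345205A8, set_bits, offset_bits)

# Hypothetical contents for the one set we care about.
sets = {set_index: [(tag, bytes(range(32)))]}
hit = lookup(sets, 0x345205A8, set_bits, offset_bits)
```

Only the (at most 4) entries of one set are compared against the tag, instead of all 512 entries.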

## Look in the L2 cache

Same idea as the L1 cache.

If the address is not in any cache, go to RAM!

Many programs could be using the same address 0x345205A8, so the address a program uses is a virtual address that must be translated to a physical one.

from Overview of the Protected Mode Operation of the Intel Architecture by Steve Gorman.

| Dir | Page | Offset |
|---|---|---|
| 0011010001 | 0100100000 | 010110101000 |
| 209 | 288 | 1448 |

Compute the page frame as CR3[209][288]

| Frame | Offset |
|---|---|
| CR3[209][288] | 1448 |
| 00001010101010101010 | 010110101000 |

But if CR3[209][288] is "empty", the page must be read in from the disk. This should be an extremely rare occurrence.
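
The translation can be sketched as follows. This is a toy model: the page directory is a nested dict rather than tables in memory located through CR3, the `PageFault` class is invented for the example, and the frame number is the one shown above.

```python
class PageFault(LookupError):
    """Raised when the entry is "empty" and the page must come from disk."""

def split_virtual(addr):
    """Split a 32-bit virtual address into (dir, page, offset): 10 + 10 + 12 bits."""
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF

def translate(page_directory, addr):
    """Two lookups (directory, then page table) give the frame number."""
    directory, page, offset = split_virtual(addr)
    page_table = page_directory.get(directory)
    if page_table is None or page not in page_table:
        raise PageFault(hex(addr))
    frame = page_table[page]
    return (frame << 12) | offset   # frame number followed by the unchanged offset

# Hypothetical tables holding the single mapping from the example.
dir_index, page_index, offset = split_virtual(0x345205A8)
page_directory = {dir_index: {page_index: 0b00001010101010101010}}
physical = translate(page_directory, 0x345205A8)
print(dir_index, page_index, offset, hex(physical))
```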

1 lookup in the page directory + 1 lookup in the page table + 1 access to RAM is two memory accesses too many.

Translation Lookaside Buffer (TLB)

| page number | frame number |
|---|---|
| 00110111110100100011 | 00001011110101001100 |
| 00110100010010111000 | 00011110101000000000 |
| 00110100010100100000 | 00001010101010101010 |
| 00111101110100000110 | 00000000001111101010 |

Look up the page number in the TLB. If it is there, use its frame number. If it is not there, go to the page tables and add the new translation to the TLB. Cache misses should be rare! Try the command vmstat on your favorite Unix computer.
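
A minimal sketch of that lookup. The least-recently-used replacement policy and the `walk_page_tables` callback are assumptions made for the example; the notes do not say how the P6 chooses which entry to replace.

```python
from collections import OrderedDict

class TLB:
    """Toy TLB: a small page-number -> frame-number cache with LRU replacement."""

    def __init__(self, capacity, walk_page_tables):
        self.capacity = capacity
        self.walk = walk_page_tables      # slow path: page number -> frame number
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def frame_for(self, page_number):
        if page_number in self.entries:
            self.hits += 1
            self.entries.move_to_end(page_number)   # mark as recently used
            return self.entries[page_number]
        self.misses += 1
        frame = self.walk(page_number)              # go to the page tables
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)        # evict the least recently used
        self.entries[page_number] = frame
        return frame

    def translate(self, addr):
        """Replace the 20-bit page number with the frame number; keep the offset."""
        return (self.frame_for(addr >> 12) << 12) | (addr & 0xFFF)

# Hypothetical page tables with the one mapping from the example.
page_tables = {0b00110100010100100000: 0b00001010101010101010}
tlb = TLB(capacity=64, walk_page_tables=page_tables.__getitem__)
first = tlb.translate(0x345205A8)    # miss: walks the page tables
second = tlb.translate(0x345205AC)   # hit: same page, no walk needed
```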

P6 TLBs

| Type | Entries |
|---|---|
| Data -- 4-kbyte pages | 64 |
| Instruction -- 4-kbyte pages | 32 |
| Data -- large pages | 8 |
| Instruction -- large pages | 2 |