NCTUCS 2013-Fall Computer Organization by Professor Kai-Chiang Wu

Ch5 - Large and Fast: Exploiting Memory Hierarchy

Memory Technology

  • SRAM
  • DRAM
  • Williams-Kilburn tubes: an early memory technology

Memory Hierarchy Levels

Professor: When I studied in the US, the professors there simply abbreviated Cache Memory as $.
So from here on, when I write $, it also stands for Cache Memory.

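  • $ is usually designed together with the CPU, even though it is not inside the CPU
  • The CPU exchanges data with $ over the bus on the motherboard
  • The instruction set determines how many words each address stored in $ covers (that is, the block size)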

Cache Memory

  • Apart from the registers, the memory closest to the CPU

Direct Mapped Cache

Main memory is much larger than the cache, so it cannot all be mapped into the cache directly. A specific mapping scheme is needed.

  • Direct mapped: only one choice
    • (Block address) modulo (#Blocks in cache)
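      • The remainder of the block address divided by the total number of cache blocks is the cache address that the block address maps to
      • No division is actually performed: just take the low-order 3 bits (the remainder of dividing by 8, since the cache here has 8 blocks)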
  • N-way associative: N choices
    • #Blocks is a power of 2

    • Use low-order address bits
    • tag, index, offset(W.O., B.O.)
      • tag: identifies which block address the data in a cache entry belongs to
      • index: the position in the cache where that block address is stored
      • offset: its width depends on the block size
        • W.O.: Word offset
        • B.O.: Byte offset
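
The modulo mapping above can be sketched in Python. This is only a minimal illustration, assuming the 8-block cache used in these notes; the names NUM_BLOCKS, cache_index, cache_index_bits and cache_tag are made up for the example.

```python
NUM_BLOCKS = 8  # assumed cache size from the notes; must be a power of 2


def cache_index(block_address: int) -> int:
    """Index by the textbook definition: block address modulo #blocks."""
    return block_address % NUM_BLOCKS


def cache_index_bits(block_address: int) -> int:
    """Same result without division: keep the low-order log2(NUM_BLOCKS) bits."""
    return block_address & (NUM_BLOCKS - 1)


def cache_tag(block_address: int) -> int:
    """The remaining high-order bits form the tag."""
    index_bits = NUM_BLOCKS.bit_length() - 1  # 3 when NUM_BLOCKS is 8
    return block_address >> index_bits


# Block address 22 is 10110 in binary: tag 10, index 110.
print(cache_index(22), cache_index_bits(22), cache_tag(22))  # prints "6 6 2"
```

The bit mask only matches the modulo because the number of blocks is a power of 2, which is why that restriction appears in the list above.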

Tags and Valid Bits

  • Tag:
    • Tells which block address the data in a cache entry belongs to
    • The cache has to store the tag as well, so that it is known later which block to look for
  • Valid Bit
    • The cache entry holds no data => 0
    • The cache entry holds data => 1

Cache Example

See slides p.10 ~ p.15

  • 22 => 10110
    • tag => 10
    • index => 110
    • Hit/miss
      • Check whether the tag and index of the requested item are already in the cache (both the tag and the index must match)
      • If not, it is a miss; if so, it is a hit
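
The hit/miss check above can be tried out with a small simulation. This is a sketch under stated assumptions: an 8-block direct-mapped cache that starts empty, one valid bit and one tag per entry, and a hypothetical access sequence in which only address 22 comes from these notes.

```python
NUM_BLOCKS = 8  # 3 index bits, matching the 5-bit example address 10110


def simulate(block_addresses):
    """Return 'hit' or 'miss' for each access to an initially empty direct-mapped cache."""
    valid = [False] * NUM_BLOCKS  # valid bit per cache entry
    tags = [0] * NUM_BLOCKS       # stored tag per cache entry
    results = []
    for addr in block_addresses:
        index = addr % NUM_BLOCKS
        tag = addr // NUM_BLOCKS
        if valid[index] and tags[index] == tag:
            results.append("hit")
        else:
            results.append("miss")
            # On a miss the block is brought in, replacing whatever was there.
            valid[index] = True
            tags[index] = tag
    return results


# Hypothetical sequence of block addresses (only 22 appears in the notes).
print(simulate([22, 26, 22, 26, 16, 3, 16, 18]))
# ['miss', 'miss', 'hit', 'hit', 'miss', 'miss', 'hit', 'miss']
```

The last access (18 = 10010) misses because it shares index 010 with 26 (11010) but has a different tag, so it replaces that entry.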

Address Subdivision

See p.16

  • byte address
  • 2^n blocks
  • block data size 2^m words
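
As a rough sketch of this subdivision, the function below splits a byte address into tag, index, word offset and byte offset. It assumes 4-byte words (so the byte offset is 2 bits) and 32-bit addresses. The function names and the sample address are hypothetical.

```python
def split_address(addr: int, n: int, m: int):
    """Split a byte address into (tag, index, word_offset, byte_offset).

    Assumes 4-byte words, 2**n blocks in the cache and 2**m words per block.
    """
    byte_offset = addr & 0b11                   # lowest 2 bits: byte within a word
    word_offset = (addr >> 2) & ((1 << m) - 1)  # next m bits: word within a block
    index = (addr >> (m + 2)) & ((1 << n) - 1)  # next n bits: cache entry
    tag = addr >> (n + m + 2)                   # everything above the index
    return tag, index, word_offset, byte_offset


def tag_bits(n: int, m: int, addr_bits: int = 32) -> int:
    """Width of the tag field: address width minus index and offset widths."""
    return addr_bits - (n + m + 2)


# Example: 1024 blocks (n = 10) of one word each (m = 0).
tag, index, wo, bo = split_address(0x12345678, n=10, m=0)
print(hex(tag), index, wo, bo, tag_bits(10, 0))  # prints "0x12345 414 0 0 20"
```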


Address Translation

  • Virtual Memory

Page Fault Penalty

  • A page fault means that the requested data is not in physical memory but on disk storage.
  • If a virtual address is looked up in the page table and the page cannot be found in physical memory, that is a page fault.
  • A page fault is similar to a miss in a cache
  • Millions of clock cycles (much larger than the cache miss penalty)

Page Tables


  • Virtual Address (VA) -> Page Table (PT) -> Physical Address (PA)
  • page table entries (PTEs)
  • page table register in CPU points to page table in physical memory
  • status bits
    • referenced
    • dirty - mark the page first; the write-back happens later
    • valid - whether the entry really holds data
      • If the valid bit is zero, a page fault occurs: the data is not in physical memory but on disk storage.
  • Every Page Table entry is 4 bytes.

Translation Using a Page Table


  • Knowing that each page table entry is 4 bytes gives the size of the page table (number of entries × 4 bytes); the width of the page offset itself comes from the page size.
  • Virtual address(32 bits) = Virtual page number(32-x bits) + Page Offset(x bits)
    • x depends on the size of the page.
    • x is usually 12 bits (4 KB per page).
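
The translation above can be sketched as follows, assuming x = 12 (4 KB pages). The dictionary-based page table, the field names "valid" and "ppn", the PageFault exception and the sample numbers are all made up for illustration; a real page table is an array in physical memory.

```python
PAGE_OFFSET_BITS = 12                # x = 12, i.e. 4 KB pages
PAGE_SIZE = 1 << PAGE_OFFSET_BITS


class PageFault(Exception):
    """Raised when the page is not in physical memory (valid bit is 0)."""


def translate(va: int, page_table: dict) -> int:
    """Translate a virtual address to a physical address via the page table."""
    vpn = va >> PAGE_OFFSET_BITS     # virtual page number: the upper 32 - x bits
    offset = va & (PAGE_SIZE - 1)    # page offset: the lower x bits, copied unchanged
    entry = page_table.get(vpn)
    if entry is None or not entry["valid"]:
        raise PageFault(f"virtual page {vpn:#x} is not in physical memory")
    return (entry["ppn"] << PAGE_OFFSET_BITS) | offset


# Hypothetical page table: virtual page 0x12 lives in physical page 0x3A.
page_table = {
    0x12: {"valid": True, "ppn": 0x3A},
    0x13: {"valid": False, "ppn": 0},
}
print(hex(translate(0x00012ABC, page_table)))  # prints "0x3aabc"
```

Only the page number is translated; the offset passes through as is, which is why its width depends on the page size alone.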

Replacement and Writes

  • Use least-recently used (LRU) replacement to reduce the page fault rate.
  • The dirty bit in the PTE is set when the page is written, so only modified pages have to be written back to disk when they are replaced.
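
A minimal simulation of these two ideas is sketched below. It assumes exact LRU (real systems usually approximate it with the referenced bit) and a fixed number of physical frames. The function name and the access trace are hypothetical.

```python
from collections import OrderedDict


def simulate_lru(accesses, num_frames):
    """Simulate LRU page replacement with dirty bits.

    accesses: iterable of (page, is_write) pairs.
    Returns (page_faults, write_backs), where write_backs counts evicted
    pages that were dirty and so had to be written back to disk.
    """
    frames = OrderedDict()  # page -> dirty flag, least recently used first
    faults = 0
    write_backs = 0
    for page, is_write in accesses:
        if page in frames:
            frames.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(frames) >= num_frames:
                _, dirty = frames.popitem(last=False)  # evict the LRU page
                if dirty:
                    write_backs += 1
            frames[page] = False  # a freshly loaded page is clean
        if is_write:
            frames[page] = True   # set the dirty bit on a write
    return faults, write_backs


trace = [(1, True), (2, False), (3, False), (1, False),
         (4, False), (2, False), (5, False)]
print(simulate_lru(trace, num_frames=3))  # prints "(6, 1)"
```

In this trace only page 1 was written, so only its eviction costs a disk write; the clean pages 2 and 3 are simply dropped.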

Page Table Problems

  • The page table is too big
  • Access to the page table is too slow
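
To get a feel for "too big", here is a back-of-the-envelope calculation using the numbers from these notes: 32-bit virtual addresses, a 12-bit page offset and 4-byte PTEs. It assumes a flat, single-level page table, and the function name is made up.

```python
def page_table_bytes(va_bits: int = 32, page_offset_bits: int = 12,
                     pte_bytes: int = 4) -> int:
    """Size of a flat page table: one PTE per virtual page."""
    num_entries = 1 << (va_bits - page_offset_bits)  # 2**20 virtual pages
    return num_entries * pte_bytes


print(page_table_bytes() // (1 << 20), "MiB")  # prints "4 MiB"
```

Under these assumptions every process needs its own 4 MiB table, and each memory access first has to read a PTE from main memory, which is the slowness the second bullet refers to.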

Fast Translation Using a TLB

  • VA -> PT (in main memory) / TLB (in CPU, for frequently used entries) -> PA
  • Translation Look-aside Buffer (TLB)
    • Use a fast cache of PTEs within the CPU
    • Accesses to the page table have good temporal locality
    • A TLB miss occurs when the lookup in the TLB fails (the translation then has to be fetched from the PT)
    • These misses can be handled by hardware or software
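
The lookup order can be sketched as a small cache sitting in front of the page table. The class below is illustrative only: the name TLB, the LRU eviction policy, the capacity and the plain dict standing in for the page table are all assumptions, and page faults are left out.

```python
from collections import OrderedDict


class TLB:
    """Toy TLB: a small cache of VPN -> PPN translations in front of the page table."""

    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table  # stands in for the PT in main memory
        self.entries = OrderedDict()  # vpn -> ppn, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int) -> int:
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)  # keep recently used entries
            return self.entries[vpn]
        # TLB miss: fetch the translation from the page table (slow path).
        self.misses += 1
        ppn = self.page_table[vpn]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the LRU entry (assumed policy)
        self.entries[vpn] = ppn
        return ppn


tlb = TLB(capacity=2, page_table={0: 10, 1: 11, 2: 12})
for vpn in [0, 0, 1, 0, 2, 1]:
    tlb.lookup(vpn)
print(tlb.hits, tlb.misses)  # prints "2 4"
```

Repeated accesses to the same page hit in the TLB and skip the page table entirely, which is where the temporal locality mentioned above pays off.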


