
⚡ Cache Memory

Cache is a small, fast memory that stores copies of frequently accessed data. Learn about cache organization, mapping techniques, and performance optimization.

What is Cache Memory?

ℹ️

Definition

Cache memory is a small-sized type of volatile computer memory that provides high-speed data access to the processor. It stores frequently used program instructions and data to reduce the average time to access data from the main memory.

L1 Cache: 16-128 KB, ~1 ns access time

L2 Cache: 128 KB - 1 MB, ~5 ns access time

L3 Cache: 1-32 MB, ~15 ns access time

Cache Mapping Techniques

💡

Key Concept

Cache mapping determines how memory blocks are placed in cache lines. The three main techniques are Direct Mapping, Associative Mapping, and Set-Associative Mapping.

Direct Mapping

Each memory block maps to exactly one cache line. Simple but leads to conflicts.

Associative

Any block can go to any line. Flexible but requires complex hardware.

Set-Associative

Hybrid: block maps to a set of lines. Best trade-off.
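All three mapping techniques start by splitting a memory address into a tag, an index, and a byte offset; they differ only in how many lines the index selects. A minimal sketch of that split, using illustrative parameters (64-byte lines, 64 sets; real caches vary):

```python
def split_address(address, line_size=64, num_sets=64):
    """Return (tag, set_index, offset) for a byte address.

    Parameters are illustrative, not taken from any specific CPU.
    """
    offset = address % line_size                    # byte within the line
    set_index = (address // line_size) % num_sets   # which line (or set) to check
    tag = address // (line_size * num_sets)         # identifies which block occupies it
    return tag, set_index, offset

tag, idx, off = split_address(0x12345)
```

In direct mapping the index selects exactly one line; in set-associative mapping it selects a set of lines whose tags are compared in parallel; in fully associative mapping there is no index at all and every tag is compared.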

Cache Performance Metrics

Hit Rate

Percentage of accesses found in cache. Typical L1 hit rate: 90-95%.

Miss Rate

Percentage of accesses NOT found in cache = 1 - Hit Rate.

Hit Time

Time to access cache when data is found (typically 1-2 CPU cycles).

Miss Penalty

Extra time needed to fetch data from next level when cache misses.

Average Access Time

Hit Time + (Miss Rate × Miss Penalty). Goal is to minimize this.

Locality

Temporal: reuse same data. Spatial: use nearby data. Cache exploits both.
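The average access time formula above is simple enough to check numerically. A quick sketch with illustrative numbers (1-cycle hit time, 5% miss rate, 100-cycle miss penalty; real values depend on the machine):

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    # AMAT = Hit Time + (Miss Rate x Miss Penalty)
    return hit_time + miss_rate * miss_penalty

# Illustrative numbers: even a 5% miss rate dominates the average
# when the miss penalty is 100x the hit time.
amat = average_access_time(hit_time=1, miss_rate=0.05, miss_penalty=100)  # 6.0 cycles
```

Note how a small miss rate still multiplies the effective access time severalfold, which is why reducing misses matters more than shaving hit time.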

Cache Write Policies

Write-Through

Write to both cache AND main memory simultaneously. Simple but slow for writes.

Pros: Data consistency, simple
Cons: Slow writes, high bus traffic

Write-Back

Write only to cache. Write back to memory when block is evicted.

Pros: Fast writes, less traffic
Cons: Complex, risk of data loss
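The key mechanism behind write-back is the dirty bit: a write only touches the cache, and memory is updated when the line is evicted. A minimal sketch, with memory simplified to a dictionary and eviction triggered manually (both are illustrative simplifications):

```python
class WriteBackLine:
    """One cache line with the extra 'dirty' state write-back needs."""
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = None
        self.value = None

memory = {}  # stand-in for main memory, keyed by tag

def write(line, tag, value):
    # Write-back: update only the cache and mark the line dirty.
    line.valid, line.dirty = True, True
    line.tag, line.value = tag, value

def evict(line):
    # Only dirty lines must be written back; clean lines are just dropped.
    if line.valid and line.dirty:
        memory[line.tag] = line.value
    line.valid = line.dirty = False

line = WriteBackLine()
write(line, tag=7, value=42)
# memory is still empty here -- nothing was written through
evict(line)
# only now does memory hold the value for tag 7
```

A write-through cache would instead update `memory` inside `write` itself, which is simpler but sends every store onto the bus.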

Cache Access Flow

CPU Request → Check Cache → Hit? → Yes: Return Data / No: Miss! Fetch from next level, then return data

Sample Code: Cache Simulator

python

class CacheLine:
    def __init__(self, size=64):
        self.valid = False
        self.tag = 0
        self.data = [0] * size
        self.last_access = 0  # unused here; useful for LRU in associative caches

class DirectMappedCache:
    def __init__(self, num_lines=64, line_size=64):
        self.lines = [CacheLine(line_size) for _ in range(num_lines)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block_size = len(self.lines[0].data)
        line_index = (address // block_size) % len(self.lines)
        tag = address // (block_size * len(self.lines))
        line = self.lines[line_index]
        
        if line.valid and line.tag == tag:
            self.hits += 1
            return line.data[address % block_size]  # Cache HIT
        else:
            self.misses += 1
            line.valid = True
            line.tag = tag
            return None  # Cache MISS - real hardware would now fetch the block from memory
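The simulator above can be condensed to tag checks alone to compare access patterns. The sketch below (same illustrative parameters: 64 lines, 64-byte blocks) shows why sequential access performs well and why direct mapping suffers conflict misses:

```python
def hit_rate(addresses, num_lines=64, line_size=64):
    """Fraction of accesses that hit a direct-mapped cache (tags only)."""
    tags = [None] * num_lines
    hits = 0
    for a in addresses:
        idx = (a // line_size) % num_lines
        tag = a // (line_size * num_lines)
        if tags[idx] == tag:
            hits += 1
        else:
            tags[idx] = tag  # miss: install the new block's tag
    return hits / len(addresses)

sequential = hit_rate(range(0, 4096))   # one miss per line, then hits: high rate
conflict = hit_rate([0, 4096] * 100)    # two blocks, same line index: 0.0
```

Addresses 0 and 4096 differ only in the tag, so they fight over the same line and every access misses; this is exactly the conflict problem that set-associative mapping addresses.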

Interview Questions

What is the difference between cache hit and cache miss?

A cache hit occurs when the CPU finds requested data in cache. A cache miss occurs when data is not in cache and must be fetched from main memory, incurring significant penalty. The goal is to maximize hit rate through good locality and mapping.

Explain temporal and spatial locality.

Temporal locality: recently accessed data is likely to be accessed again soon (e.g., loop variables). Spatial locality: data near recently accessed data is likely to be accessed soon (e.g., array elements). Cache exploits both: temporal via keeping data, spatial via fetching cache lines (blocks).
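Both kinds of locality show up in the most ordinary code. The loop below is purely illustrative, with the locality of each access noted in comments:

```python
data = list(range(1000))

total = 0
for x in data:      # spatial locality: consecutive elements share cache lines,
    total += x      # so one fetched block serves many iterations.
                    # temporal locality: 'total' is reused every iteration,
                    # so it stays resident in cache (or a register).
```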

Why is set-associative mapping preferred over direct mapping?

Direct mapping causes conflict misses when multiple frequently used blocks map to the same line. Set-associative mapping reduces conflicts by allowing a block to map to multiple lines in a set. 2-way to 8-way set associative gives most of the benefit of fully associative with much simpler hardware.
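The conflict-reduction argument can be made concrete with a small sketch of a 2-way set-associative cache using LRU replacement. Parameters (4 sets, 2 ways, 64-byte lines) are illustrative, and `OrderedDict` stands in for the hardware's recency tracking:

```python
from collections import OrderedDict

class SetAssociativeCache:
    def __init__(self, num_sets=4, ways=2, line_size=64):
        # Each set is an OrderedDict of resident tags, ordered by recency.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.ways = ways
        self.line_size = line_size
        self.hits = self.misses = 0

    def access(self, address):
        set_index = (address // self.line_size) % len(self.sets)
        tag = address // (self.line_size * len(self.sets))
        s = self.sets[set_index]
        if tag in s:
            self.hits += 1
            s.move_to_end(tag)         # mark as most recently used
        else:
            self.misses += 1
            if len(s) == self.ways:
                s.popitem(last=False)  # evict the least recently used tag
            s[tag] = None

cache = SetAssociativeCache()
for a in [0, 1024, 0, 1024]:  # two blocks that map to the same set
    cache.access(a)
# Both blocks fit in the 2-way set, so the repeat accesses hit.
# A direct-mapped cache would evict on every access and thrash.
```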