Cache Memory
Cache is a small, fast memory that stores copies of frequently accessed data. Learn about cache organization, mapping techniques, and performance optimization.
What is Cache Memory?
Definition
L1 Cache: 16-128 KB, ~1 ns access
L2 Cache: 128 KB - 1 MB, ~5 ns access
L3 Cache: 1-32 MB, ~15 ns access
Cache Mapping Techniques
Key Concept
Direct Mapping
Each memory block maps to exactly one cache line. Simple but leads to conflicts.
Associative
Any block can go to any line. Flexible but requires complex hardware.
Set-Associative
Hybrid: each block maps to one set and can occupy any line within that set. The best trade-off in practice.
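The three schemes differ only in how an address is split into tag, index, and offset. A minimal sketch, assuming an illustrative geometry (64-byte blocks, 64 lines) that is not taken from the text above:

```python
# Hypothetical helper showing how each mapping scheme decomposes a byte
# address. BLOCK_SIZE and NUM_LINES are illustrative assumptions.

BLOCK_SIZE = 64   # bytes per cache line
NUM_LINES = 64    # total lines in the cache

def decompose(address, ways=1):
    """Split an address into (tag, index, offset) for a cache whose
    NUM_LINES lines are grouped into sets of `ways` lines each.

    ways=1         -> direct mapped (one line per set)
    ways=NUM_LINES -> fully associative (one set, index always 0)
    anything between -> set-associative
    """
    offset = address % BLOCK_SIZE          # byte within the block
    block_no = address // BLOCK_SIZE       # which memory block
    num_sets = NUM_LINES // ways
    index = block_no % num_sets            # which set the block maps to
    tag = block_no // num_sets             # identifies the block in the set
    return tag, index, offset

addr = 0x1234
print(decompose(addr, ways=1))          # direct mapped
print(decompose(addr, ways=4))          # 4-way set-associative
print(decompose(addr, ways=NUM_LINES))  # fully associative
```

Note how increasing associativity shrinks the index field and grows the tag: with fewer sets, more of the address must be stored as a tag to identify the block.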
Cache Performance Metrics
Hit Rate
Percentage of accesses found in cache. Typical L1 hit rate: 90-95%.
Miss Rate
Percentage of accesses NOT found in cache = 1 - Hit Rate.
Hit Time
Time to access cache when data is found (typically 1-2 CPU cycles).
Miss Penalty
Extra time needed to fetch data from next level when cache misses.
Average Access Time
Hit Time + (Miss Rate × Miss Penalty). The goal is to minimize this.
Locality
Temporal: reuse same data. Spatial: use nearby data. Cache exploits both.
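The average-access-time formula above can be sketched directly; the example numbers (hit time, miss rate, miss penalty) are illustrative assumptions, not values from a real processor:

```python
# Sketch of the AMAT formula: hit time + miss rate * miss penalty.
# All timing numbers below are assumed for illustration.

def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time in the same units as the inputs."""
    return hit_time + miss_rate * miss_penalty

# L1 with a 95% hit rate: 1 ns hits, 5% of accesses pay 20 ns extra.
print(amat(hit_time=1.0, miss_rate=0.05, miss_penalty=20.0))  # 2.0 ns

# The formula nests for multi-level caches: the L1 miss penalty is
# itself the AMAT of L2 (assumed numbers again).
l2 = amat(hit_time=5.0, miss_rate=0.20, miss_penalty=100.0)
print(amat(hit_time=1.0, miss_rate=0.05, miss_penalty=l2))
```

Notice how a small change in miss rate dominates: even cheap hits cannot compensate for a large miss penalty multiplied across many misses.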
Cache Write Policies
Write-Through
Write to both cache AND main memory simultaneously. Simple but slow for writes.
Write-Back
Write only to cache. Write back to memory when block is evicted.
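The write-traffic difference between the two policies can be sketched with a toy model; the single-block "cache" and the write pattern are assumptions chosen to make the contrast visible:

```python
# Toy comparison of write policies by counting writes that reach memory.
# Assumes a one-block cache where every access is a write (illustrative).

def count_memory_writes(writes, policy):
    """writes: sequence of block addresses being written.

    Write-through sends every write to memory immediately.
    Write-back only writes a dirty block back when it is evicted
    (plus one final flush of the last dirty block).
    """
    if policy == "write-through":
        return len(writes)
    mem_writes = 0
    cached = None
    for block in writes:
        if cached is not None and cached != block:
            mem_writes += 1  # evicting a dirty block forces a write-back
        cached = block
    return mem_writes + (1 if cached is not None else 0)  # final flush

pattern = [7, 7, 7, 3, 3, 7]  # repeated writes to the same few blocks
print(count_memory_writes(pattern, "write-through"))  # 6
print(count_memory_writes(pattern, "write-back"))     # 3
```

Repeated writes to the same block collapse into one memory write under write-back, which is why it is preferred when write traffic to memory is expensive, at the cost of a dirty bit and a more complex eviction path.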
Cache Access Flow
On each access, the cache indexes a line with the address, compares the stored tag, and either returns the data (hit) or fetches the block from the next level (miss).
Sample Code: Cache Simulator
python
class CacheLine:
    def __init__(self, size=64):
        self.valid = False
        self.tag = 0
        self.data = [0] * size
        self.last_access = 0

class DirectMappedCache:
    def __init__(self, num_lines=64, line_size=64):
        self.lines = [CacheLine(line_size) for _ in range(num_lines)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block_size = len(self.lines[0].data)
        line_index = (address // block_size) % len(self.lines)
        tag = address // (block_size * len(self.lines))
        line = self.lines[line_index]
        if line.valid and line.tag == tag:
            self.hits += 1
            return line.data[address % block_size]  # Cache HIT
        else:
            self.misses += 1
            line.valid = True
            line.tag = tag
            return None  # Cache MISS - fetch from memory
Interview Questions
What is the difference between cache hit and cache miss?
A cache hit occurs when the CPU finds requested data in cache. A cache miss occurs when data is not in cache and must be fetched from main memory, incurring significant penalty. The goal is to maximize hit rate through good locality and mapping.
Explain temporal and spatial locality.
Temporal locality: recently accessed data is likely to be accessed again soon (e.g., loop variables). Spatial locality: data near recently accessed data is likely to be accessed soon (e.g., array elements). Cache exploits both: temporal via keeping data, spatial via fetching cache lines (blocks).
Why is set-associative mapping preferred over direct mapping?
Direct mapping causes conflict misses when multiple frequently used blocks map to the same line. Set-associative mapping reduces conflicts by allowing a block to map to multiple lines in a set. 2-way to 8-way set associative gives most of the benefit of fully associative with much simpler hardware.
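A minimal sketch of a 2-way set-associative cache with LRU replacement shows why conflicts shrink; the geometry (8 sets, 64-byte blocks) is an assumption for illustration:

```python
# 2-way set-associative cache sketch with LRU replacement (assumed
# geometry: 8 sets of 2 lines, 64-byte blocks). Each set stores tags
# ordered most-recently-used first.

BLOCK_SIZE = 64
NUM_SETS = 8
WAYS = 2

class SetAssociativeCache:
    def __init__(self):
        self.sets = [[] for _ in range(NUM_SETS)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block_no = address // BLOCK_SIZE
        index = block_no % NUM_SETS
        tag = block_no // NUM_SETS
        ways = self.sets[index]
        if tag in ways:
            self.hits += 1
            ways.remove(tag)   # re-insert at the front: most recently used
        else:
            self.misses += 1
            if len(ways) == WAYS:
                ways.pop()     # evict the least-recently-used tag
        ways.insert(0, tag)

# Two blocks with the SAME index but different tags: in a direct-mapped
# cache they would evict each other on every access (ping-pong conflict
# misses); here they share a set, so only the first touches miss.
cache = SetAssociativeCache()
a, b = 0x0000, NUM_SETS * BLOCK_SIZE  # same set index, different tags
for _ in range(4):
    cache.access(a)
    cache.access(b)
print(cache.hits, cache.misses)  # 6 2
```

The same alternating pattern run against the direct-mapped simulator above would miss on every access, which is exactly the conflict-miss problem set-associativity is designed to remove.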