⚙️ CPU Organization
The internal architecture of a CPU — how the ALU, control unit, registers, and buses work together to fetch, decode, and execute instructions.
CPU Major Components
The CPU at a Glance
Internal CPU Block Diagram
CPU Internal Organization
Data Path: How Data Flows
Data Path Definition
CPU Operation: Fetch-Decode-Execute Cycle
Fetch
PC → MAR → Memory → MDR → IR
Decode
IR[opcode] → Control Unit → Control Signals
Execute
ALU performs operation on register operands
Memory
Load/Store: access data memory via MAR/MDR
Write Back
Result written back to register file
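The five stages above can be sketched as a small software loop. This is a minimal illustration, not a model of any real processor: the instruction format (pre-decoded tuples rather than encoded bits), the opcode names, and the `run` function are all assumptions made for brevity.

```python
# Toy five-stage interpreter. Instructions are pre-decoded tuples
# (op, a, b, c); this tuple format is an assumption for illustration.
def run(imem, dmem, regs, max_steps=100):
    pc = 0
    for _ in range(max_steps):
        # Fetch: PC -> MAR -> Memory -> MDR -> IR
        mar = pc
        if mar not in imem:
            break  # no instruction at this address: stop
        mdr = imem[mar]
        ir = mdr
        pc += 4

        # Decode: split IR into opcode and operand fields
        op, a, b, c = ir

        # Execute: ALU computes a result or an effective address
        if op == "ADD":            # ADD rd, rs, rt -> a=rd, b=rs, c=rt
            alu_out = regs[b] + regs[c]
        elif op in ("LW", "SW"):   # a=data register, b=base register, c=offset
            alu_out = regs[b] + c
        else:
            raise ValueError(f"unknown opcode {op!r}")

        # Memory: only loads and stores access data memory
        loaded = None
        if op == "LW":
            loaded = dmem.get(alu_out, 0)
        elif op == "SW":
            dmem[alu_out] = regs[a]

        # Write Back: ALU result or loaded data goes to the register file
        if op == "ADD":
            regs[a] = alu_out
        elif op == "LW":
            regs[a] = loaded
    return regs, dmem


program = {
    0: ("LW", 2, 0, 100),    # R2 <- M[R0 + 100]
    4: ("LW", 3, 0, 104),    # R3 <- M[R0 + 104]
    8: ("ADD", 1, 2, 3),     # R1 <- R2 + R3
    12: ("SW", 1, 0, 108),   # M[R0 + 108] <- R1
}
regs, dmem = run(program, {100: 7, 104: 5}, [0] * 8)
print(regs[1], dmem[108])
```

Note that ADD skips the Memory stage and the store skips Write Back, which matches the stage descriptions above.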
Step-by-Step: ADD R1, R2, R3
1. Fetch
PC → MAR. Memory read. MDR → IR. PC = PC + 4.
2. Decode
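IR[31:26] → opcode. The control unit generates the ALU control signals. The register fields select the operands and the destination: IR[25:21] = R2 (first source), IR[20:16] = R3 (second source), IR[15:11] = R1 (destination).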
3. Execute
Register file outputs R2, R3 values to ALU inputs. ALU adds them. Result appears on ALU output.
4. Memory
No memory access is needed for ADD (an ALU operation). The stage is idle and the ALU result passes through unchanged.
5. Write Back
ALU output written to register file at address R1.
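The field extraction in the Decode step can be sketched with shifts and masks. The bit positions come from the walkthrough above; the 6-bit `funct` field in IR[5:0] and the value 0x20 for ADD are assumptions borrowed from the MIPS R-type format, which this layout resembles. The helper names are hypothetical.

```python
def encode_rtype(opcode, rs, rt, rd, shamt=0, funct=0):
    """Pack fields into a 32-bit word (MIPS-style R-type layout, assumed)."""
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


def decode_rtype(ir):
    """Extract the fields named in the Decode step from a 32-bit IR value."""
    return {
        "opcode": (ir >> 26) & 0x3F,  # IR[31:26]
        "rs": (ir >> 21) & 0x1F,      # IR[25:21], first source (R2)
        "rt": (ir >> 16) & 0x1F,      # IR[20:16], second source (R3)
        "rd": (ir >> 11) & 0x1F,      # IR[15:11], destination (R1)
        "funct": ir & 0x3F,           # IR[5:0], assumed ALU function code
    }


# ADD R1, R2, R3 with the assumed MIPS values opcode=0, funct=0x20
word = encode_rtype(opcode=0, rs=2, rt=3, rd=1, funct=0x20)
fields = decode_rtype(word)
print(hex(word), fields)
```

Each 5-bit register field can address 32 registers, which is why the mask is 0x1F.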
Single Bus vs Multi-Bus Organization
Single Bus Organization
Register Transfer Language (RTL)
What is RTL?
// Register Transfer Level description of ADD instruction
// Notation: [register] ← [source operation]
// FETCH CYCLE
T1: MAR ← PC // Send PC address to memory
    PC ← PC + 4 // Prepare for next instruction
T2: MDR ← M[MAR] // Read instruction from memory
    IR ← MDR // Load instruction into IR
// DECODE CYCLE
T3: IR[opcode] → Control Unit
    A ← Reg[IR[25:21]] // Read R2 into A register
    B ← Reg[IR[20:16]] // Read R3 into B register
// EXECUTE CYCLE
T4: ALUOut ← A + B // ALU performs addition
// WRITE BACK CYCLE
T5: Reg[IR[15:11]] ← ALUOut // Store result to R1
Real-World Example: Intel & AMD CPU Organization
Intel Core (x86-64)
- Complex control unit: microcode ROM for x86 decode → μops
- Deep out-of-order execution engine with reorder buffer (ROB)
- Split L1 cache: 32KB instruction + 32KB data (8-way associative)
- Ring bus interconnect between cores and LLC (last-level cache)
- Multiple ALUs: 4 integer + 2 or 3 vector (AVX-512) per core
AMD Ryzen (Zen)
- CCX (Core Complex): 4 cores sharing 16MB L3 cache
- Unified L2 cache (512KB per core), prefetch units
- Infinity Fabric interconnect between CCXs
- SMT (Simultaneous Multithreading): 2 threads per core
- 4-wide instruction decode, up to 6 operations dispatched per cycle
Key Difference
Interview Questions
Explain the major components of a CPU and their roles.
The CPU has four major components: (1) ALU: performs arithmetic and logic operations on data from registers. (2) Control Unit: decodes instructions and generates the control signals that direct data flow between all components. (3) Registers: fast temporary storage, consisting of the general-purpose register file plus special-purpose registers such as the PC (program counter) and IR (instruction register). (4) Bus System: carries data, addresses, and control signals between CPU, memory, and I/O.
What is the fetch-decode-execute cycle? Describe each stage in detail.
The CPU continuously repeats: (1) Fetch — PC supplies address to MAR, memory read brings instruction to MDR, then IR. PC increments by 4. (2) Decode — control unit interprets IR opcode, generates signals, selects register operands. (3) Execute — ALU performs the operation (add, sub, and, etc.) on register values. (4) Memory — for load/store instructions, access data memory via MAR/MDR. (5) Write Back — ALU result or loaded data written to register file.
Compare single-bus and multi-bus CPU organization.
Single-bus: all components share one bus for data, address, and control. Only one transfer per cycle — simple and cheap but causes bus contention and limits throughput. Multi-bus: separate buses for data/address/control, often with local (CPU-cache) and system (memory-I/O) buses. Allows parallel transfers and higher throughput at the cost of more complex hardware and bridge logic between bus domains.
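The throughput difference can be illustrated with a deliberately idealized model: a list of independent transfers is packed into cycles, with at most one transfer per bus per cycle. The `schedule` function and the register names are hypothetical, and the model ignores arbitration, bridge latency, and data dependencies between transfers.

```python
def schedule(transfers, num_buses):
    """Pack independent (src, dst) transfers into cycles.

    Idealized assumption: each bus carries exactly one transfer per cycle
    and no transfer depends on another.
    """
    return [transfers[i:i + num_buses]
            for i in range(0, len(transfers), num_buses)]


transfers = [("R1", "R5"), ("R2", "R6"), ("R3", "R7"), ("R4", "R8")]

single = schedule(transfers, num_buses=1)
multi = schedule(transfers, num_buses=3)
print(len(single), "cycles on one bus;", len(multi), "cycles on three buses")
```

With four independent transfers, the single bus needs one cycle per transfer, while three buses finish in two cycles under these assumptions.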
What is Register Transfer Language (RTL) and why is it important?
RTL describes CPU behavior at the register-transfer level — specifying data movements between registers and functional units each clock cycle. It bridges the gap between the instruction set architecture and the hardware implementation. RTL notation like 'MAR ← PC' or 'R1 ← R2 + R3' precisely describes operations and timing, forming the basis for hardware synthesis and digital design.
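To make the notation concrete, the T1 to T5 register transfers for ADD R1, R2, R3 can be replayed in software. This is a sketch under stated assumptions: the state dictionary and `run_add_rtl` are hypothetical, transfers listed under one T-step are applied in written order (real hardware would apply them on a clock edge), and the instruction word 0x00430820 assumes a MIPS-style encoding.

```python
def run_add_rtl(s):
    """Apply the T1..T5 register transfers for an R-type ADD to state s."""
    # T1: MAR <- PC ; PC <- PC + 4
    s["MAR"] = s["PC"]
    s["PC"] = s["PC"] + 4
    # T2: MDR <- M[MAR] ; IR <- MDR
    s["MDR"] = s["M"][s["MAR"]]
    s["IR"] = s["MDR"]
    # T3: A <- Reg[IR[25:21]] ; B <- Reg[IR[20:16]]
    ir = s["IR"]
    s["A"] = s["Reg"][(ir >> 21) & 0x1F]
    s["B"] = s["Reg"][(ir >> 16) & 0x1F]
    # T4: ALUOut <- A + B (wrapped to 32 bits)
    s["ALUOut"] = (s["A"] + s["B"]) & 0xFFFFFFFF
    # T5: Reg[IR[15:11]] <- ALUOut
    s["Reg"][(ir >> 11) & 0x1F] = s["ALUOut"]
    return s


state = {
    "PC": 0,
    "M": {0: 0x00430820},  # assumed encoding of ADD R1, R2, R3
    "Reg": [0] * 32,
}
state["Reg"][2] = 7
state["Reg"][3] = 5
run_add_rtl(state)
print(state["Reg"][1], state["PC"])
```

Each assignment in the function corresponds to one arrow in the RTL listing, which is what makes the notation a useful bridge between the instruction set and the hardware.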