
📋 Instruction Formats

How instructions are encoded: field layouts, opcodes, operands, and the difference between fixed and variable-length encodings.

What are Instruction Formats?

ℹ️

Binary Encoding of Operations

An instruction format defines how the CPU interprets the bits of an instruction. Each format divides the instruction into fields: an opcode (which operation to perform), operands (the registers, immediate values, or addresses the operation acts on), and optional modifiers. The CPU decodes these fields to generate control signals.

MIPS uses three primary formats (R-type, I-type, and J-type), each 32 bits wide. The opcode field (bits 31-26) identifies the format and operation. All R-type instructions share opcode 000000 and use a secondary funct field (bits 5-0) to select the operation.
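As a rough illustration of how the opcode field selects a format, the sketch below classifies a 32-bit word. The names `mips_format` and `classify_format` are hypothetical, and the sketch assumes a simplified view in which only J (000010) and JAL (000011) are J-type and every other non-zero opcode is treated as I-type; coprocessor and other special opcodes are not modeled.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; not part of any MIPS toolchain. */
typedef enum { FMT_R, FMT_I, FMT_J } mips_format;

/* Classify a 32-bit MIPS word by its opcode field (bits 31-26). */
mips_format classify_format(uint32_t instr) {
    uint32_t opcode = (instr >> 26) & 0x3Fu;
    if (opcode == 0x00u) {
        return FMT_R;   /* opcode 000000: the funct field selects the operation */
    }
    if (opcode == 0x02u || opcode == 0x03u) {
        return FMT_J;   /* J and JAL carry a 26-bit target */
    }
    return FMT_I;       /* simplification: everything else is treated as I-type */
}
```

For example, `classify_format(0x014B4820)` (the encoding of ADD $9, $10, $11) yields `FMT_R`.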

MIPS Instruction Formats

Field Layout

opcode (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)

Example

ADD $1, $2, $3 → 000000 00010 00011 00001 00000 100000

Width: 32-bit

Use: Arithmetic & logical operations

R-type Instruction Bit Layout

Field	Bits	Width
opcode	31-26	6 bits
rs	25-21	5 bits
rt	20-16	5 bits
rd	15-11	5 bits
shamt	10-6	5 bits
funct	5-0	6 bits
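The bit ranges of the R-type layout translate directly into shift-and-mask operations. The helper below is a minimal sketch; the name `bits` is made up for illustration, and it assumes the caller passes 31 >= hi >= lo.

```c
#include <stdint.h>

/* Extract bits hi..lo (inclusive, bit 0 = least significant) from a 32-bit word.
   Assumes 31 >= hi >= lo; behavior for other arguments is not defined here. */
uint32_t bits(uint32_t word, unsigned hi, unsigned lo) {
    unsigned width = hi - lo + 1u;
    /* A full 32-bit mask is special-cased because shifting by 32 is undefined in C. */
    uint32_t mask = (width == 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}
```

Applied to 0x00430820, the encoding of ADD $1, $2, $3, `bits(w, 25, 21)` gives 2 (rs), `bits(w, 20, 16)` gives 3 (rt), `bits(w, 15, 11)` gives 1 (rd), and `bits(w, 5, 0)` gives 0x20 (funct for ADD).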

Opcode and Funct Tables

Common Opcodes

Opcode	Instruction	Description
000000	ADD/SUB/AND/OR (R-type)	ALU operations; funct field selects the exact operation
001000	ADDI	Add immediate
100011	LW	Load word from memory
101011	SW	Store word to memory
000100	BEQ	Branch if equal
000010	J	Jump to address
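One way to picture the first decode step is a lookup keyed on the opcode field. The sketch below covers only the opcodes listed in this table; the function name `opcode_name` and the "?" fallback are choices made for this example.

```c
#include <stdint.h>

/* Map the primary opcode (bits 31-26) of an instruction word to a mnemonic.
   Covers only the table entries above; returns "?" for anything else. */
const char *opcode_name(uint32_t instr) {
    switch ((instr >> 26) & 0x3Fu) {
        case 0x00: return "R-type";  /* 000000: consult the funct field */
        case 0x02: return "J";       /* 000010 */
        case 0x04: return "BEQ";     /* 000100 */
        case 0x08: return "ADDI";    /* 001000 */
        case 0x23: return "LW";      /* 100011 */
        case 0x2B: return "SW";      /* 101011 */
        default:   return "?";       /* not in this table */
    }
}
```

A real decoder would drive control signals rather than return strings, but the dispatch structure is the same.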

Common Funct Codes (R-type)

Funct	Instruction	Description
100000	ADD	Add: rd = rs + rt
100010	SUB	Subtract: rd = rs - rt
100100	AND	Bitwise AND
100101	OR	Bitwise OR
101010	SLT	Set on less than
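To show how the funct field picks the ALU operation once the opcode is 000000, here is a small sketch of the dispatch. The function name `alu_rtype` is hypothetical. It deliberately ignores the overflow trap that ADD and SUB raise on real MIPS hardware, returns 0 for funct codes outside this table, and assumes a two's-complement platform for the signed comparison in SLT.

```c
#include <stdint.h>

/* Compute the rd result for the funct codes in the table above.
   Simplifications: no overflow trap for ADD/SUB; unknown funct returns 0. */
uint32_t alu_rtype(uint32_t funct, uint32_t rs_val, uint32_t rt_val) {
    switch (funct & 0x3Fu) {
        case 0x20: return rs_val + rt_val;   /* ADD (100000), wraps modulo 2^32 */
        case 0x22: return rs_val - rt_val;   /* SUB (100010) */
        case 0x24: return rs_val & rt_val;   /* AND (100100) */
        case 0x25: return rs_val | rt_val;   /* OR  (100101) */
        case 0x2A:                           /* SLT (101010): signed comparison */
            return ((int32_t)rs_val < (int32_t)rt_val) ? 1u : 0u;
        default:   return 0u;                /* not modeled in this sketch */
    }
}
```

For instance, `alu_rtype(0x2A, 0xFFFFFFFF, 1)` returns 1, because SLT treats 0xFFFFFFFF as -1.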

Fixed vs Variable Length Instructions

Feature	Fixed-Length	Variable-Length
Instruction Length	Fixed (e.g., 32 bits)	Varies (e.g., 1-15 bytes on x86)
Encoding Complexity	Simple, uniform	Complex, variable
Decoding Hardware	Simple decoder	Complex decoder (must locate instruction boundaries)
Pipeline Impact	Easy to pipeline	Harder to pipeline
Code Density	Lower density	Higher density
Example Architectures	MIPS, ARM (A32/A64), RISC-V (base ISA)	x86, 68000

💡

Modern Compromise

ARM's Thumb-2 and RISC-V's compressed instruction extension (RVC) offer a compromise: they use 16-bit encodings for common instructions alongside 32-bit instructions, improving code density while keeping decoding relatively simple.
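For RISC-V specifically, decoding stays simple because the instruction length can be read from the lowest bits of the first 16-bit parcel: anything whose two lowest bits are not 11 is a 16-bit compressed instruction. The sketch below illustrates that check. The function name `rv_instr_length` is hypothetical, and the sketch handles only the 16-bit and 32-bit encodings, returning 0 for the longer encodings that the specification sets aside. Thumb-2 uses a different scheme and is not covered here.

```c
#include <stdint.h>

/* Length in bytes of a RISC-V instruction, given its first 16-bit parcel.
   Handles only 16- and 32-bit encodings; returns 0 for longer formats. */
unsigned rv_instr_length(uint16_t first_parcel) {
    if ((first_parcel & 0x3u) != 0x3u) {
        return 2;   /* bits [1:0] != 11: compressed (RVC) instruction */
    }
    if ((first_parcel & 0x1Cu) != 0x1Cu) {
        return 4;   /* bits [1:0] == 11 and bits [4:2] != 111: 32-bit instruction */
    }
    return 0;       /* longer encodings are not handled in this sketch */
}
```

A fetch unit can therefore find the next instruction boundary by inspecting a handful of bits, which is far cheaper than parsing prefixes and opcode bytes as on x86.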

Encoding & Decoding Example

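// Encoding: ADD $9, $10, $11
// R-type: opcode(000000) rs(01010) rt(01011) rd(01001) shamt(00000) funct(100000)
// Binary: 000000 01010 01011 01001 00000 100000
// Hex: 0x014B4820

#include <stdint.h>
#include <stdio.h>

// Mask each field to its width and widen it to 32 bits before shifting, so an
// out-of-range value cannot spill into a neighboring field and the shift of
// the opcode cannot overflow a signed int.
uint32_t encode_rtype(uint8_t op, uint8_t rs, uint8_t rt, uint8_t rd, uint8_t shamt, uint8_t funct) {
    return ((uint32_t)(op & 0x3F) << 26) | ((uint32_t)(rs & 0x1F) << 21)
         | ((uint32_t)(rt & 0x1F) << 16) | ((uint32_t)(rd & 0x1F) << 11)
         | ((uint32_t)(shamt & 0x1F) << 6) | (uint32_t)(funct & 0x3F);
}

// Decoding: extract every field, then interpret them according to the opcode
void decode(uint32_t instr) {
    uint8_t opcode = (instr >> 26) & 0x3F;
    uint8_t rs     = (instr >> 21) & 0x1F;
    uint8_t rt     = (instr >> 16) & 0x1F;
    uint8_t rd     = (instr >> 11) & 0x1F;
    uint8_t shamt  = (instr >> 6)  & 0x1F;
    uint8_t funct  = instr & 0x3F;
    uint16_t imm   = instr & 0xFFFF;

    if (opcode == 0) {
        // R-type: funct selects the operation
        printf("R-type: rd=%d, rs=%d, rt=%d, shamt=%d, funct=0x%X\n", rd, rs, rt, shamt, (unsigned)funct);
    } else if (opcode == 0x02 || opcode == 0x03) {
        // J-type (J, JAL): the low 26 bits hold the jump target
        printf("J-type: opcode=0x%X, target=0x%X\n", (unsigned)opcode, (unsigned)(instr & 0x03FFFFFF));
    } else {
        // I-type: the 16-bit immediate is printed as a signed value
        printf("I-type: opcode=0x%X, rs=%d, rt=%d, imm=%d\n", (unsigned)opcode, rs, rt, (int16_t)imm);
    }
}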

Interview Questions

Explain the three MIPS instruction formats and their differences.

R-type (Register) uses three register operands and is used for arithmetic/logic. I-type (Immediate) uses two registers and a 16-bit immediate for ALU ops, loads/stores, and branches. J-type (Jump) uses a 26-bit target address for unconditional jumps. R-type has a funct field to select the exact ALU operation since opcode is 000000.
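The I-type and J-type layouts described in this answer can be sketched as encoders. The function names `encode_itype` and `encode_jtype` are made up for illustration. The field widths follow the answer above: I-type is opcode(6), rs(5), rt(5), immediate(16), and J-type is opcode(6), target(26). The sketch assumes the caller has already converted a jump address into its 26-bit word index.

```c
#include <stdint.h>

/* I-type: opcode(6) | rs(5) | rt(5) | immediate(16).
   The signed immediate is stored as its 16-bit two's-complement pattern. */
uint32_t encode_itype(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm) {
    return ((op & 0x3Fu) << 26) | ((rs & 0x1Fu) << 21) | ((rt & 0x1Fu) << 16)
         | (uint32_t)(uint16_t)imm;
}

/* J-type: opcode(6) | target(26).
   target26 is assumed to be the word index (byte address >> 2), truncated to 26 bits. */
uint32_t encode_jtype(uint32_t op, uint32_t target26) {
    return ((op & 0x3Fu) << 26) | (target26 & 0x03FFFFFFu);
}
```

For example, `encode_itype(0x08, 9, 8, -1)` produces 0x2128FFFF, the encoding of ADDI $8, $9, -1, and `encode_jtype(0x02, 0x100000)` produces 0x08100000, a J to byte address 0x00400000 within the current 256 MB region.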

What are the tradeoffs between fixed-length and variable-length instructions?

Fixed-length (e.g., MIPS 32-bit) simplifies decoding and pipelining but wastes space on simple instructions. Variable-length (e.g., x86) improves code density but complicates decoding: the length of each instruction must be determined before the next one can be located, so instruction boundaries aren't known upfront and fetching and decoding several instructions in parallel becomes harder.

How does the opcode field determine instruction behavior?

The opcode tells the CPU which operation to perform. In MIPS, the main opcode (bits 31-26) identifies the instruction type. For R-type, the opcode is all zeros and the funct field selects the exact operation. For I-type, the opcode both selects the operation and marks the instruction as using an immediate. For J-type, the opcode identifies a jump.

What is the shamt field used for in R-type instructions?

Shamt (shift amount, bits 10-6) specifies the shift amount for shift instructions like SLL (shift left logical) and SRL (shift right logical). For non-shift R-type instructions, shamt is set to 0 and ignored.
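As a small illustration of how shamt is consumed, the sketch below applies SLL or SRL from an encoded R-type word to the value of rt. The function name `exec_shift` is hypothetical. It uses the standard funct codes 000000 for SLL and 000010 for SRL, and returns the input unchanged for any other funct, which is a simplification for this example.

```c
#include <stdint.h>

/* Apply the shift encoded in an R-type word to rt_val.
   Handles only SLL (funct 0x00) and SRL (funct 0x02); any other
   funct returns rt_val unchanged in this sketch. */
uint32_t exec_shift(uint32_t instr, uint32_t rt_val) {
    uint32_t shamt = (instr >> 6) & 0x1Fu;   /* bits 10-6: shift amount, 0..31 */
    uint32_t funct = instr & 0x3Fu;          /* bits 5-0 */
    switch (funct) {
        case 0x00: return rt_val << shamt;   /* SLL: shift left, zero fill */
        case 0x02: return rt_val >> shamt;   /* SRL: shift right, zero fill */
        default:   return rt_val;            /* not a shift handled here */
    }
}
```

For example, 0x00094100 encodes SLL $8, $9, 4 (shamt = 00100), so `exec_shift(0x00094100, 3)` returns 48.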