📋 Instruction Formats

How instructions are encoded: field layouts, opcodes, operands, and the difference between fixed and variable-length encodings.

What are Instruction Formats?

ℹ️

Binary Encoding of Operations

An instruction format defines how the CPU interprets the bits of an instruction. Each format divides the instruction into fields: an opcode (what to do), operands (which registers, immediates, or addresses to operate on), and optional modifiers. The CPU decodes these fields to generate control signals.

MIPS uses three primary formats (R-type, I-type, and J-type), each 32 bits wide. The opcode field (bits 31-26) identifies the format and operation. R-type instructions share opcode 000000 and use a secondary funct field (bits 5-0) to select the operation.
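The opcode-to-format rule can be sketched in C. This is a minimal illustration rather than a full decoder: `mips_format` and `classify_format` are hypothetical names, and any opcode that is not 000000 (R-type) or 000010/000011 (J and JAL) is assumed to be I-type, which glosses over coprocessor and other special encodings.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum { FMT_R, FMT_I, FMT_J } mips_format;

/* Classify a 32-bit MIPS instruction word by its opcode (bits 31-26). */
mips_format classify_format(uint32_t instr) {
    uint32_t opcode = (instr >> 26) & 0x3F;

    if (opcode == 0x00) {
        return FMT_R;               /* funct (bits 5-0) selects the operation */
    }
    if (opcode == 0x02 || opcode == 0x03) {
        return FMT_J;               /* J and JAL carry a 26-bit target */
    }
    return FMT_I;                   /* simplification: everything else */
}
```

For instance, `classify_format(0x8C000000u)` sees opcode 100011 (LW) and reports `FMT_I`.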

MIPS Instruction Formats

Field Layout (R-type)

opcode (6) | rs (5) | rt (5) | rd (5) | shamt (5) | funct (6)

Example

ADD $1, $2, $3 → 000000 00010 00011 00001 00000 100000

Width: 32-bit
Use: Arithmetic & logical operations

R-type Instruction Bit Layout

Field | opcode | rs | rt | rd | shamt | funct
Bits | 31-26 | 25-21 | 20-16 | 15-11 | 10-6 | 5-0
Width | 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits
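The bit ranges in this layout translate directly into a shift and a mask. The helper below is a small sketch (the name `extract_bits` is an assumption, not part of any MIPS toolchain) that pulls out an arbitrary inclusive bit range, so each field can be read using the same numbers that appear in the layout.

```c
#include <stdint.h>

/* Return bits hi..lo (inclusive) of a 32-bit word, right-aligned.
 * Hypothetical helper; assumes 0 <= lo <= hi <= 31. */
uint32_t extract_bits(uint32_t word, unsigned hi, unsigned lo) {
    unsigned width = hi - lo + 1;
    /* A 32-bit shift of 1u would be undefined, so handle full width separately. */
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}
```

With the ADD $1, $2, $3 encoding (0x00430820), `extract_bits(w, 25, 21)` yields rs = 2 and `extract_bits(w, 5, 0)` yields funct = 0x20.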

Opcode and Funct Tables

Common Opcodes

Opcode | Instruction | Description
000000 | ADD/SUB/AND/OR (R-type) | ALU operations; the funct field selects the exact operation
001000 | ADDI | Add immediate
100011 | LW | Load word from memory
101011 | SW | Store word to memory
000100 | BEQ | Branch if equal
000010 | J | Jump to address
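In software, a table like this often becomes a switch or a lookup array. The sketch below covers only the opcodes listed here; `opcode_name` is a hypothetical function name, and any opcode outside the table is reported as "unknown" rather than decoded.

```c
#include <stdint.h>

/* Map a 6-bit MIPS opcode to a mnemonic. Only the opcodes from the
 * table above are covered; this is not a complete opcode map. */
const char *opcode_name(uint8_t opcode) {
    switch (opcode & 0x3F) {
    case 0x00: return "R-type";   /* 000000: consult funct */
    case 0x08: return "ADDI";     /* 001000 */
    case 0x23: return "LW";       /* 100011 */
    case 0x2B: return "SW";       /* 101011 */
    case 0x04: return "BEQ";      /* 000100 */
    case 0x02: return "J";        /* 000010 */
    default:   return "unknown";
    }
}
```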

Common Funct Codes (R-type)

Funct | Instruction | Description
100000 | ADD | Add: rd = rs + rt
100010 | SUB | Subtract: rd = rs - rt
100100 | AND | Bitwise AND
100101 | OR | Bitwise OR
101010 | SLT | Set on less than

Fixed vs Variable Length Instructions

Feature | Fixed-Length | Variable-Length
Instruction Length | Fixed (e.g., 32 bits in MIPS) | Varies (e.g., 1-15 bytes in x86)
Encoding Complexity | Simple, uniform | Complex, variable
Decoding Hardware | Simple decoder | Complex decoder (length must be determined first)
Pipeline Impact | Easy to pipeline | Harder to pipeline
Code Density | Lower density | Higher density
Example Architectures | MIPS, ARM (A32/A64), RISC-V (base ISA) | x86, 68000
💡

Modern Compromise

ARM's Thumb-2 and RISC-V's compressed instruction extension (RVC) offer a compromise: they use 16-bit encodings for common instructions alongside 32-bit instructions, improving code density while keeping decoding relatively simple.
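The decoding cost of mixed lengths can be illustrated with RISC-V, where the two lowest bits of an instruction's first 16-bit parcel are 11 for a 32-bit instruction and anything else for a 16-bit compressed one. The sketch below assumes only those two lengths occur (longer encodings are ignored), and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte length of a RISC-V instruction, judged from its first 16-bit parcel.
 * Assumption: only 16-bit (RVC) and 32-bit encodings are present. */
unsigned rv_instr_length(uint16_t first_parcel) {
    return ((first_parcel & 0x3u) == 0x3u) ? 4u : 2u;
}

/* Count the instructions in a stream of n 16-bit parcels. Unlike a
 * fixed-length ISA, where instruction k sits at byte offset 4*k, each
 * length must be read before the next instruction can be located. */
size_t count_instructions(const uint16_t *parcels, size_t n) {
    size_t count = 0;
    size_t i = 0;
    while (i < n) {
        i += rv_instr_length(parcels[i]) / 2u;   /* advance 1 or 2 parcels */
        count++;
    }
    return count;
}
```

The length check is a single two-bit test, which is why this scheme stays far simpler than finding instruction boundaries in an x86 byte stream.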

Encoding & Decoding Example

c

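#include <stdint.h>
#include <stdio.h>

// Encoding: ADD $9, $10, $11
// R-type: opcode(000000) rs(01010) rt(01011) rd(01001) shamt(00000) funct(100000)
// Binary: 000000 01010 01011 01001 00000 100000
// Hex: 0x014B4820

// Each field is masked to its width and widened to uint32_t before shifting,
// so stray high bits cannot spill into a neighbouring field and the shift
// never overflows a signed int.
uint32_t encode_rtype(uint8_t op, uint8_t rs, uint8_t rt, uint8_t rd, uint8_t shamt, uint8_t funct) {
    return ((uint32_t)(op & 0x3F) << 26) | ((uint32_t)(rs & 0x1F) << 21) |
           ((uint32_t)(rt & 0x1F) << 16) | ((uint32_t)(rd & 0x1F) << 11) |
           ((uint32_t)(shamt & 0x1F) << 6) | (uint32_t)(funct & 0x3F);
}

// Decoding
void decode(uint32_t instr) {
    uint8_t opcode  = (instr >> 26) & 0x3F;
    uint8_t rs      = (instr >> 21) & 0x1F;
    uint8_t rt      = (instr >> 16) & 0x1F;
    uint8_t rd      = (instr >> 11) & 0x1F;
    uint8_t shamt   = (instr >> 6)  & 0x1F;
    uint8_t funct   = instr & 0x3F;
    uint16_t imm    = instr & 0xFFFF;      // I-type: 16-bit immediate
    uint32_t target = instr & 0x03FFFFFF;  // J-type: 26-bit jump target

    if (opcode == 0) {
        // R-type: use funct
        printf("R-type: rd=%d, rs=%d, rt=%d, shamt=%d, funct=0x%X\n", rd, rs, rt, shamt, funct);
    } else if (opcode == 0x02 || opcode == 0x03) {
        // J-type (J, JAL): the low 26 bits hold the target
        printf("J-type: opcode=0x%X, target=0x%07X\n", opcode, (unsigned)target);
    } else {
        // I-type: the cast to int16_t sign-extends the immediate
        printf("I-type: opcode=0x%X, rs=%d, rt=%d, imm=%d\n", opcode, rs, rt, (int16_t)imm);
    }
}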

Interview Questions

Explain the three MIPS instruction formats and their differences.

R-type (Register) uses three register operands and is used for arithmetic/logic. I-type (Immediate) uses two registers and a 16-bit immediate for ALU ops, loads/stores, and branches. J-type (Jump) uses a 26-bit target address for unconditional jumps. Because every R-type instruction shares opcode 000000, the funct field selects the exact ALU operation.
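The I-type and J-type layouts described in this answer can be packed in the same way as an R-type word. The following is a sketch with hypothetical helper names (`encode_itype`, `encode_jtype`); it assumes the caller has already computed the 26-bit jump field, which in MIPS holds a word address rather than a byte address.

```c
#include <stdint.h>

/* I-type: opcode (6) | rs (5) | rt (5) | immediate (16) */
uint32_t encode_itype(uint8_t op, uint8_t rs, uint8_t rt, int16_t imm) {
    return ((uint32_t)(op & 0x3F) << 26) | ((uint32_t)(rs & 0x1F) << 21) |
           ((uint32_t)(rt & 0x1F) << 16) |
           (uint32_t)(uint16_t)imm;   /* keep only the low 16 bits */
}

/* J-type: opcode (6) | target (26) */
uint32_t encode_jtype(uint8_t op, uint32_t target) {
    return ((uint32_t)(op & 0x3F) << 26) | (target & 0x03FFFFFFu);
}
```

For example, ADDI $8, $9, -1 is `encode_itype(0x08, 9, 8, -1)`, which gives 0x2128FFFF, and LW $8, 4($9) is `encode_itype(0x23, 9, 8, 4)`, which gives 0x8D280004.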

What are the tradeoffs between fixed-length and variable-length instructions?

Fixed-length (e.g., MIPS 32-bit) simplifies decoding and pipelining but wastes space on simple instructions. Variable-length (e.g., x86) improves code density but complicates decoding: the length of each instruction must be determined before the next one can be located, so instruction boundaries aren't known upfront and fetching and pipelining become harder.

How does the opcode field determine instruction behavior?

The opcode tells the CPU which operation to perform. In MIPS, the main opcode (bits 31-26) identifies the instruction type. For R-type, the opcode is all zeros and the funct field selects the exact operation. For I-type, the opcode both selects the operation and marks the instruction as using an immediate. For J-type, the opcode identifies a jump and the remaining 26 bits hold the target.

What is the shamt field used for in R-type instructions?

Shamt (shift amount, bits 10-6) specifies the shift amount for shift instructions like SLL (shift left logical) and SRL (shift right logical). For non-shift R-type instructions, shamt is set to 0 and ignored.
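A shift instruction shows shamt in use. In the sketch below, the funct values for SLL (000000) and SRL (000010) are the standard MIPS encodings; the helper names are hypothetical. For these constant shifts the rs field is left at zero, rt is the source register, and rd is the destination.

```c
#include <stdint.h>

#define FUNCT_SLL 0x00u   /* shift left logical  */
#define FUNCT_SRL 0x02u   /* shift right logical */

/* Encode "op rd, rt, shamt" for a constant shift (opcode and rs are 0). */
uint32_t encode_shift(uint8_t funct, uint8_t rd, uint8_t rt, uint8_t shamt) {
    return ((uint32_t)(rt & 0x1F) << 16) | ((uint32_t)(rd & 0x1F) << 11) |
           ((uint32_t)(shamt & 0x1F) << 6) | (uint32_t)(funct & 0x3F);
}

/* Read the shift amount back out of bits 10-6. */
uint8_t get_shamt(uint32_t instr) {
    return (uint8_t)((instr >> 6) & 0x1F);
}
```

SLL $8, $9, 4 comes out as 0x00094100. Since the field is 5 bits wide, shift amounts are limited to 0-31.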