Direct Memory Access (DMA)
DMA is a hardware mechanism that lets peripheral devices transfer data directly to and from main memory without the CPU moving each byte. The CPU only programs the transfer and handles its completion, which frees it for other work and raises system throughput.
What is DMA?
Definition
CPU Overhead Reduction
Without DMA, CPU copies every byte between device and memory. With DMA, CPU only sets up the transfer.
Parallel Operation
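While the DMA controller transfers data, the CPU can continue executing instructions, so computation and I/O overlap. The overlap holds as long as the CPU does not need the bus at the same moment (for example, while it runs from cache or registers).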
DMA Controller Functionality
DMA Controller Components
Source Address Register
Points to the memory location where data is read from (for memory-to-I/O transfers).
Destination Address Register
Points to the memory location where data will be written (for I/O-to-memory transfers).
Transfer Count Register
Number of bytes/words left to transfer. Decremented after each transfer; when it reaches zero, the transfer is complete.
Control Register
Configures direction (read/write), transfer mode (burst/cycle steal), and enables interrupts on completion.
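The four registers above can be pictured with a small software model. This is a minimal sketch, not any real controller's layout: the `dma_model_t` struct, the `CTRL_ENABLE` and `CTRL_IRQ_EN` bits, and `dma_step` are all hypothetical names chosen for illustration, and each call to `dma_step` stands in for one DMA bus cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control-register bits, used by this model only. */
#define CTRL_ENABLE 0x1u
#define CTRL_IRQ_EN 0x2u

/* Software model of the registers described above. */
typedef struct {
    const uint8_t *src;  /* Source Address Register */
    uint8_t *dst;        /* Destination Address Register */
    uint32_t count;      /* Transfer Count Register (bytes remaining) */
    uint32_t ctrl;       /* Control Register */
    int irq_pending;     /* set when the count reaches zero with IRQs enabled */
} dma_model_t;

/* Move one byte and update the registers, as one DMA bus cycle would.
   Returns 1 if a byte was moved, 0 if the controller is idle. */
int dma_step(dma_model_t *d) {
    assert(d != NULL);
    if (!(d->ctrl & CTRL_ENABLE) || d->count == 0)
        return 0;
    *d->dst++ = *d->src++;       /* copy one byte, advance both addresses */
    if (--d->count == 0) {       /* count reached zero: transfer complete */
        d->ctrl &= ~CTRL_ENABLE; /* controller goes idle */
        if (d->ctrl & CTRL_IRQ_EN)
            d->irq_pending = 1;  /* completion interrupt would fire here */
    }
    return 1;
}
```

Calling `dma_step` in a loop until it returns 0 copies the whole block and leaves `count` at zero, which mirrors how software learns that a real transfer has finished.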
DMA Transfer Modes
Burst Mode
DMA controller takes control of the bus and transfers the entire block without releasing it. Fastest mode, but the CPU is locked out of the bus for the entire duration and stalls on any memory access it cannot serve from cache.
Cycle Stealing
DMA transfers one byte/word at a time, then releases the bus. Alternates bus access between CPU and DMA.
Transparent Mode
DMA transfers only when the CPU is not using the bus (e.g., during CPU internal operations). No CPU slowdown at all, but the transfer rate depends on how often the bus is idle, which makes this the slowest mode.
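The trade-off between the three modes can be seen in a toy bus simulation. This is a sketch under simplifying assumptions: one bus slot per cycle, one word per DMA slot, and a fixed repeating pattern of CPU bus demand that does not shift when the CPU is stalled. The names `simulate_bus`, `bus_result_t`, and the `MODE_*` constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { MODE_BURST, MODE_CYCLE_STEAL, MODE_TRANSPARENT } dma_mode_t;

typedef struct {
    uint32_t total_cycles;  /* bus cycles until the DMA block is done */
    uint32_t cpu_stalls;    /* cycles where the CPU wanted the bus but DMA held it */
    uint32_t longest_stall; /* longest run of consecutive stalled cycles */
} bus_result_t;

/* cpu_wants[i % period] != 0 means the CPU wants the bus on cycle i. */
bus_result_t simulate_bus(dma_mode_t mode, uint32_t words,
                          const uint8_t *cpu_wants, uint32_t period) {
    assert(cpu_wants != NULL && period > 0);
    uint32_t idle_slots = 0;
    for (uint32_t i = 0; i < period; i++)
        if (!cpu_wants[i]) idle_slots++;
    /* Transparent mode only progresses if the CPU sometimes leaves the bus idle. */
    assert(mode != MODE_TRANSPARENT || words == 0 || idle_slots > 0);

    bus_result_t r = {0, 0, 0};
    uint32_t run = 0;
    int dma_turn = 1;  /* cycle stealing alternates DMA and CPU slots */
    while (words > 0) {
        int cpu = cpu_wants[r.total_cycles % period] != 0;
        int dma;
        switch (mode) {
        case MODE_BURST:       dma = 1; break;                 /* holds the bus */
        case MODE_CYCLE_STEAL: dma = dma_turn; dma_turn = !dma_turn; break;
        default:               dma = !cpu; break;              /* idle slots only */
        }
        if (dma) words--;
        if (dma && cpu) {
            r.cpu_stalls++;
            if (++run > r.longest_stall) r.longest_stall = run;
        } else {
            run = 0;
        }
        r.total_cycles++;
    }
    return r;
}
```

With a CPU that wants the bus on three of every four cycles and a 4-word block, this model finishes burst mode in 4 cycles with a 3-cycle continuous stall, cycle stealing in 7 cycles with stalls of at most 1 cycle each, and transparent mode in 16 cycles with no stalls. Burst minimizes transfer time, cycle stealing bounds CPU latency, and transparent mode costs the CPU nothing but takes longest.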
DMA vs Programmed I/O Performance
| Metric | Programmed I/O | DMA (Cycle Steal) | DMA (Burst) |
|---|---|---|---|
| CPU cycles per byte | 10-100 | 1-2 (setup amortized) | 0 (total block) |
| CPU utilization during 1 MB transfer | 100% | ~5% | 0% |
| Transfer time (1 MB, 100 MHz bus) | ~10 ms | ~0.5 ms | ~0.1 ms |
| System throughput impact | Severe | Minimal | Moderate |
| Best for | Byte-at-a-time devices | Medium-speed, always-on | High-speed, block devices |
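The "CPU cycles per byte" row can be turned into a rough cost model. The sketch below is illustrative only: the function names are hypothetical, and the setup and interrupt costs passed in are assumed inputs, not measurements of any real system.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles spent when the CPU moves every byte itself. */
uint64_t pio_cpu_cycles(uint64_t bytes, uint32_t cycles_per_byte) {
    return bytes * cycles_per_byte;
}

/* CPU cycles spent with DMA: a fixed setup cost plus the completion
   interrupt, independent of the block size. */
uint64_t dma_cpu_cycles(uint32_t setup_cycles, uint32_t irq_cycles) {
    return (uint64_t)setup_cycles + irq_cycles;
}

/* Smallest transfer size (in bytes) at which DMA costs the CPU fewer
   cycles than programmed I/O. */
uint64_t dma_break_even_bytes(uint32_t cycles_per_byte,
                              uint32_t setup_cycles, uint32_t irq_cycles) {
    assert(cycles_per_byte > 0);
    return dma_cpu_cycles(setup_cycles, irq_cycles) / cycles_per_byte + 1;
}
```

For example, at 10 CPU cycles per byte a 1 MB (1,048,576-byte) programmed I/O transfer costs 10,485,760 CPU cycles. With an assumed 200-cycle setup and 500-cycle interrupt handler, DMA costs 700 cycles regardless of size, so it wins for any transfer of 71 bytes or more. Below the break-even size, programmed I/O can be the cheaper choice.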
DMA Channels and Arbitration
DMA Channels
Daisy Chaining
Devices are connected in series. Bus grant passes from one device to the next. Simple but can be unfair to devices farther from the CPU.
Independent Request
Each device has dedicated request/grant lines to the arbiter. More complex but allows programmable priority.
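The two arbitration schemes differ only in how a winner is picked among simultaneous requests. The following sketch assumes requests arrive as a bitmask (bit i set means device i is requesting) and that device 0 sits closest to the CPU in the chain; `daisy_chain_grant` and `independent_grant` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Daisy chain: the grant enters at device 0 and is absorbed by the first
   requesting device it reaches. Returns the winner's index, or -1. */
int daisy_chain_grant(uint32_t requests, int ndev) {
    assert(ndev >= 0 && ndev <= 32);
    for (int i = 0; i < ndev; i++)
        if (requests & (1u << i))
            return i;  /* devices further down the chain never see the grant */
    return -1;
}

/* Independent request: every device has its own line to the arbiter, which
   picks the requester with the highest programmable priority. */
int independent_grant(uint32_t requests, const uint8_t *priority, int ndev) {
    assert(priority != NULL && ndev >= 0 && ndev <= 32);
    int winner = -1;
    for (int i = 0; i < ndev; i++) {
        if (!(requests & (1u << i)))
            continue;
        if (winner < 0 || priority[i] > priority[winner])
            winner = i;
    }
    return winner;
}
```

When devices 1 and 2 request together, the daisy chain always grants device 1 because of its position, while the independent arbiter grants whichever of the two software has given the higher priority. That fixed positional bias is the source of the unfairness noted above.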
Bus Mastering
What is Bus Mastering?
CPU vs DMA Data Path
Code Example: DMA Transfer Concept
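```c
#include <stdint.h>

// DMA controller registers (memory-mapped)
#define DMA_SRC    ((volatile uint32_t *)0xFFFFF000)
#define DMA_DST    ((volatile uint32_t *)0xFFFFF004)
#define DMA_CNT    ((volatile uint32_t *)0xFFFFF008)
#define DMA_CTRL   ((volatile uint32_t *)0xFFFFF00C)
#define DMA_STATUS ((volatile uint32_t *)0xFFFFF010)

#define DMA_START  0x1
#define DMA_READ   0x2  // Device -> Memory
#define DMA_WRITE  0x4  // Memory -> Device
#define DMA_IRQ_EN 0x8

// Device I/O ports and status bit for the programmed-I/O path.
// Placeholder values: the real ones come from the device's datasheet.
#define DEV_STATUS 0x00
#define DEV_DATA   0x01
#define READY      0x01

// Port-read primitive (e.g. x86 inb), assumed to be provided by the platform
extern uint8_t inb(uint16_t port);

// DMA transfer setup: program the registers, then start the controller.
// Returns immediately; completion is signaled by interrupt.
void dma_transfer(void *src, void *dst, uint32_t count, int dir) {
    *DMA_SRC = (uint32_t)(uintptr_t)src;  // assumes 32-bit bus addresses
    *DMA_DST = (uint32_t)(uintptr_t)dst;
    *DMA_CNT = count;
    *DMA_CTRL = (uint32_t)dir | DMA_IRQ_EN | DMA_START;  // written last: starts the transfer
}

// Without DMA: CPU copies every byte (slow)
void programmed_io_transfer(uint8_t *buf, uint32_t count) {
    for (uint32_t i = 0; i < count; i++) {
        while (!(inb(DEV_STATUS) & READY))
            ;                        // Poll until the device has a byte ready
        buf[i] = inb(DEV_DATA);      // Read byte
        // CPU stalls on every byte: terrible for large transfers
    }
}
```
DMA Transfer Flow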
Real-World DMA Example
Interview Questions
What is the difference between DMA and programmed I/O?
In programmed I/O, the CPU is involved in every byte transferred: it reads from the device and writes to memory. In DMA, a dedicated controller handles transfers directly between device and memory. DMA requires setup overhead but transfers each byte without CPU involvement, making it vastly more efficient for large or high-speed transfers.
Explain the three DMA transfer modes with examples.
1) Burst mode: DMA transfers the entire block without releasing the bus. Example: a disk controller reading a full sector. 2) Cycle stealing: DMA transfers one word, then releases the bus. Example: a sound card continuously streaming audio. 3) Transparent mode: DMA transfers only while the CPU is not using the system bus (for instance, during internal operations). Example: background memory refresh.
What happens during a DMA cycle steal?
The DMA controller requests bus access via the HOLD signal. The CPU finishes the current bus cycle, asserts HLDA (Hold Acknowledge), and tri-states its bus lines. The DMA controller performs one bus transfer (read from the device and write to memory, or vice versa), then deasserts HOLD. The CPU resumes. This steals one bus cycle from the CPU.
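The handshake in this answer can be traced with a tiny signal-level model. It is a sketch only: `bus_t` and the three functions are hypothetical, signals are plain integers, and timing is reduced to the order of the calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int hold;        /* HOLD: driven by the DMA controller */
    int hlda;        /* HLDA: driven by the CPU */
    uint32_t stolen; /* bus cycles given up to the DMA controller */
} bus_t;

/* DMA controller asks for the bus. */
void dma_assert_hold(bus_t *b) {
    assert(b != NULL);
    b->hold = 1;
}

/* CPU side, evaluated at the end of each CPU bus cycle: acknowledge a
   pending HOLD, or take the bus back once HOLD has been dropped. */
void cpu_sample_hold(bus_t *b) {
    assert(b != NULL);
    b->hlda = b->hold;
}

/* DMA side: perform exactly one transfer if the bus has been granted,
   then deassert HOLD. Returns 1 if a transfer happened, 0 otherwise. */
int dma_try_transfer(bus_t *b, const uint8_t *src, uint8_t *dst) {
    assert(b != NULL && src != NULL && dst != NULL);
    if (!(b->hold && b->hlda))
        return 0;    /* no grant yet: the controller must wait */
    *dst = *src;     /* the single stolen bus cycle */
    b->stolen++;
    b->hold = 0;     /* release the request */
    return 1;
}
```

The order of calls matches the prose: `dma_assert_hold`, then `cpu_sample_hold` (HLDA goes high), then `dma_try_transfer` (one transfer, HOLD drops), then `cpu_sample_hold` again (HLDA goes low and the CPU resumes). A transfer attempted before the acknowledge simply fails, which is why the controller cannot touch the bus mid-cycle.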
How does scatter-gather DMA work?
Scatter-gather DMA uses a list of buffer descriptors (source, destination, length tuples) in memory. The DMA controller processes each descriptor sequentially, automatically chaining transfers across multiple non-contiguous buffers. This eliminates the need for the CPU to copy data into a single contiguous buffer. Widely used in network adapters and storage controllers.
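A descriptor chain of this kind might look as follows in software. This is a minimal sketch, assuming a simple linked-list layout: `sg_desc_t` and `sg_run` are hypothetical, real controllers define their own descriptor formats, and `memcpy` stands in for the bus transfers the hardware would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One buffer descriptor: a (source, destination, length) tuple plus a
   link to the next descriptor. NULL terminates the chain. */
typedef struct sg_desc {
    const uint8_t *src;
    uint8_t *dst;
    uint32_t len;
    const struct sg_desc *next;
} sg_desc_t;

/* Walk the chain the way the controller would, one descriptor at a time.
   Returns the total number of bytes moved. */
uint32_t sg_run(const sg_desc_t *d) {
    uint32_t total = 0;
    for (; d != NULL; d = d->next) {
        assert(d->len == 0 || (d->src != NULL && d->dst != NULL));
        memcpy(d->dst, d->src, d->len);  /* stands in for hardware transfers */
        total += d->len;
    }
    return total;
}
```

A gather operation then needs one descriptor per fragment: two descriptors with sources in separate buffers and destinations at offsets 0 and 3 of the same output buffer assemble a 5-byte contiguous result in a single call, with no intermediate CPU copy.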