ES: MCU Buses | Amr Tarek

Without buses, the CPU could not fetch instructions, access variables, or control peripherals.

To understand advanced topics like DMA, memory mapping, performance tuning, or hardware security, we must first understand how buses work.

What is a Bus in a Microcontroller?

A bus is a set of parallel electrical lines used to transfer digital signals between components inside the microcontroller.

It carries three fundamental types of information:

Data
Address
Control signals

Conceptually, a bus is a digital highway.

Inside any MCU such as:

STM32F4
ATmega328P
ESP32
NXP LPC1768

the CPU does not connect directly to memory or peripherals with separate wires.

Instead, everything is connected through shared bus structures.

Basic internal topology:

            +----------------+
            |      CPU       |
            +----------------+
                   |
                   |  Internal Bus System
                   |
    +--------------+--------------+
    |              |              |
+---------+   +--------+     +-------------+
|  Flash  |   |  SRAM  |     | Peripherals |
+---------+   +--------+     +-------------+

The Three Fundamental MCU Buses

Every classical microcontroller architecture divides internal communication into three logical buses:

Data Bus
Address Bus
Control Bus

Although physically they may share wires (depending on architecture), logically they represent different signal groups.

Let us examine each one carefully.

Data Bus

The data bus transfers the actual binary information between system components.

Whenever:

CPU fetches an instruction from Flash
CPU reads a variable from SRAM
CPU writes to a GPIO register

data travels over the data bus.

Conceptual diagram:

CPU  <====== Data Bus ======>  Memory / Peripheral

Bus Width and Performance

The width of the data bus defines how many bits can be moved in a single bus transfer. In the best case, one transfer completes per clock cycle.

Examples:

8-bit MCU → transfers up to 1 byte per cycle
32-bit MCU → transfers up to 4 bytes per cycle

Example comparison:

ATmega328P → 8-bit data path
STM32F4 → 32-bit data path

This directly impacts:

Arithmetic throughput
Memory bandwidth
Interrupt handling speed
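
To make the width comparison concrete, here is a minimal Python sketch that counts how many bus transfers are needed to move a buffer. The function name transfers_needed is made up for this illustration, and the model assumes every transfer uses the full bus width with no wait states.

```python
import math

def transfers_needed(payload_bytes: int, bus_width_bits: int) -> int:
    """Return how many bus transfers move payload_bytes over a bus of the given width."""
    bytes_per_transfer = bus_width_bits // 8
    return math.ceil(payload_bytes / bytes_per_transfer)

# Copying a 64-byte buffer:
print(transfers_needed(64, 8))   # 8-bit data path  -> 64
print(transfers_needed(64, 32))  # 32-bit data path -> 16
```

Under these assumptions the 32-bit path needs a quarter of the transfers. Real parts add wait states and alignment effects, so treat the ratio as an upper bound on the gain.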

Now that we understand how data moves, the next logical question becomes:

How does the CPU know where to read or write?

That is the role of the Address Bus.

Address Bus

The address bus selects the memory location that the CPU wants to access.

It does not carry data — it carries location identifiers.

If the address bus is N bits wide, the maximum addressable memory space is:

2^N locations

Example:

16-bit address bus:

2^16 = 65,536 addresses = 64 KB (assuming one byte per address)
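
The same arithmetic can be checked for other widths with a short Python sketch. The helper addressable_bytes is a name invented for this example, and it assumes byte-addressable memory unless told otherwise.

```python
def addressable_bytes(address_bits: int, bytes_per_location: int = 1) -> int:
    """Return the size of the address space for an address bus of the given width."""
    return (2 ** address_bits) * bytes_per_location

for bits in (16, 24, 32):
    size = addressable_bytes(bits)
    print(f"{bits}-bit address bus: {size} bytes = {size // 1024} KB")
```

A 32-bit address bus gives 4 GB of address space, far more than any MCU has in physical memory. The unused space is what makes room for peripheral registers.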

Conceptual flow:

CPU ---- Address Bus ----> Memory
           (select location)

The width of the address bus limits the total address space, and therefore:

Maximum Flash size
Maximum SRAM size
Peripheral address range

This leads us to an important architectural concept:

Memory-mapped peripherals, which we will handle later in detail.

But selecting a location is not enough — the system must also know what operation to perform.

This brings us to the Control Bus.

Control Bus

The control bus carries signals that coordinate operations.

Typical signals include:

Read (RD)
Write (WR)
Clock (CLK)
Reset
Interrupt acknowledge

Diagram:

CPU ---[ RD | WR | CLK | INT ]---> Memory / Peripheral

Example operation cycle:

CPU places address on address bus
CPU asserts RD signal
Memory places data on data bus
CPU reads the data
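
The four steps above can be sketched as a toy simulation in Python. The Memory class and the cpu_read / cpu_write helpers are hypothetical names for this illustration. Real buses also involve clock edges and timing, which this model leaves out.

```python
class Memory:
    """Toy 8-bit memory that responds to the RD and WR control signals."""

    def __init__(self, size: int):
        self.cells = [0] * size

    def cycle(self, address: int, rd: bool, wr: bool, data_in: int = 0):
        """Run one bus cycle; return the value driven on the data bus, or None."""
        if rd and wr:
            raise ValueError("RD and WR must not be asserted together")
        if wr:
            self.cells[address] = data_in & 0xFF  # keep only 8 data bits
            return None
        if rd:
            return self.cells[address]
        return None  # no control signal asserted: memory stays idle


def cpu_write(mem: Memory, address: int, value: int) -> None:
    # Address and data are placed on their buses, then WR is asserted.
    mem.cycle(address, rd=False, wr=True, data_in=value)


def cpu_read(mem: Memory, address: int) -> int:
    # 1. address on the address bus, 2. RD asserted,
    # 3. memory drives the data bus, 4. CPU takes the value.
    return mem.cycle(address, rd=True, wr=False)


ram = Memory(256)
cpu_write(ram, 0x10, 0xAB)
print(hex(cpu_read(ram, 0x10)))  # 0xab
```

Note that with neither RD nor WR asserted, the memory does nothing even though an address is present.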

Without control signals, the system would not know:

Whether to read or write
When to respond
When data is valid

Now that we understand the three bus types, we must see how they are organized depending on architecture style.

Bus Architecture: Harvard vs Von Neumann

The organization of buses depends on the memory architecture.

In Von Neumann architecture:

One shared bus is used for both instructions and data

CPU
  |
Shared Bus
  |
Memory (Code + Data)

In Harvard architecture:

Separate instruction bus
Separate data bus

            Instruction Bus
CPU  ------------------------> Flash

            Data Bus
CPU  ------------------------> SRAM

Many modern MCUs, such as the STM32F4, use a modified Harvard architecture: separate instruction and data buses, but a single unified address space.

Benefits:

Parallel instruction fetch + data access
Higher throughput
Better real-time determinism
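
A rough way to see the throughput benefit is to count bus cycles under idealized assumptions. The Python sketch below is not a model of any real core. It assumes every access takes exactly one cycle and that, with separate buses, a fetch and a data access always overlap perfectly. Both function names are invented for this example.

```python
def cycles_shared_bus(instruction_fetches: int, data_accesses: int) -> int:
    """Von Neumann style: fetches and data accesses take turns on one bus."""
    return instruction_fetches + data_accesses


def cycles_split_buses(instruction_fetches: int, data_accesses: int) -> int:
    """Harvard style: the two access streams overlap, so the busier bus sets the total."""
    return max(instruction_fetches, data_accesses)


# A loop with 100 instruction fetches, 40 of which also touch data memory:
print(cycles_shared_bus(100, 40))   # 140
print(cycles_split_buses(100, 40))  # 100
```

Real gains are smaller, since wait states, pipeline stalls, and literal loads from Flash reduce the overlap.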

Understanding this separation prepares us for more advanced bus systems used in ARM-based microcontrollers.

AMBA Bus Architecture in ARM Microcontrollers

Most modern ARM Cortex-M MCUs implement AMBA (Advanced Microcontroller Bus Architecture), designed by ARM Holdings.

AMBA introduces structured bus hierarchies such as:

AHB (Advanced High-performance Bus)
APB (Advanced Peripheral Bus)

Example in STM32F4:

                 +-----------+
                 | Cortex-M  |
                 +-----------+
                      |
                     AHB
                      |
        +-------------+-------------+
        |             |             |
      Flash         SRAM     AHB/APB Bridge
                                    |
                                   APB
                                    |
                         UART  SPI  I2C  TIMERS

On the STM32F4, the GPIO ports are attached to AHB rather than APB.

Design rationale:

AHB → High-speed memory and DMA
APB → Lower-speed peripherals
Reduced power consumption
Segmented performance domains

AHB – Advanced High-performance Bus

It is part of ARM’s AMBA bus architecture.

It connects:

CPU
High-speed memory
DMA
High-speed peripherals

Inside an MCU, multiple blocks need to communicate.

Instead of direct wiring, a bus system is used.

AHB is the main highway inside the MCU.

             +--------+
             |  CPU   |
             +---+----+
                 |
                AHB
 +---------------+---------------+
 |               |               |
Flash            SRAM            DMA

AHB Characteristics:

High bandwidth
32-bit data width in most Cortex-M MCUs (the AHB specification also allows wider buses)
Burst transfers supported
Multi-master capable

Now that we see multiple masters on the bus, an important question appears:

What happens when more than one unit wants the bus at the same time?

Bus Arbitration and Multi-Master Systems

Modern MCUs contain multiple bus masters:

CPU
DMA controller
Debug interface
Sometimes Ethernet or USB controllers

They may all request memory access at the same time.

The bus arbiter decides priority.

CPU ----\
          \
DMA ------> [ BUS ARBITER ] ----> SRAM
           /
Debug ----/

This affects:

Interrupt latency
Real-time determinism
DMA performance
System jitter
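
To illustrate why the arbitration policy matters, here is a small Python sketch of two common schemes: fixed priority and round-robin. The priority order and all function names are assumptions made for this example. Actual arbiters are vendor-specific hardware and often configurable.

```python
from typing import List, Optional, Set

# Assumed order for this sketch, highest priority first.
PRIORITY = ["DMA", "CPU"]


def fixed_priority_grant(requests: Set[str]) -> Optional[str]:
    """Grant the bus to the highest-priority master that is requesting it."""
    for master in PRIORITY:
        if master in requests:
            return master
    return None


def round_robin_grants(requests_per_cycle: List[Set[str]], masters: List[str]) -> List[Optional[str]]:
    """Grant the bus in rotating order, starting after the previous winner."""
    grants = []
    last = -1
    for requests in requests_per_cycle:
        winner = None
        for offset in range(1, len(masters) + 1):
            candidate_index = (last + offset) % len(masters)
            if masters[candidate_index] in requests:
                winner = masters[candidate_index]
                last = candidate_index
                break
        grants.append(winner)
    return grants


# Both masters request the bus on every one of four cycles.
contention = [{"CPU", "DMA"}] * 4
print([fixed_priority_grant(r) for r in contention])    # ['DMA', 'DMA', 'DMA', 'DMA']
print(round_robin_grants(contention, ["CPU", "DMA"]))   # ['CPU', 'DMA', 'CPU', 'DMA']
```

With fixed priority the CPU is starved for as long as the DMA keeps requesting, which shows up as interrupt latency and jitter. Round-robin bounds the wait at the cost of slower DMA bursts.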

Understanding arbitration is critical when designing high-performance embedded systems.

APB – Advanced Peripheral Bus

It is a lower-speed bus used for:

UART
SPI
I2C
Timers
ADC
GPIO (on some families; others, such as the STM32F4, place GPIO on AHB)

APB is optimized for:

Simplicity
Low power
Low cost

         CPU
          |
         AHB
          |
    +-----+------+
    |            |
  SRAM      APB Bridge
                 |
                APB
      +----------+----------+
      |          |          |
    UART        ADC       TIMER

Notice:

AHB -> APB Bridge -> APB peripherals

APB is not high-performance like AHB.

Why Two Buses?

Because most peripherals don’t need high-speed access.

Using AHB for everything would:

Increase power
Increase complexity
Increase silicon area

So ARM separated:

AHB = Fast data path
APB = Control path

            +-------------------+
            |        CPU        |
            | +---------------+ |
            | |      MPU      | |
            | +---------------+ |
            +---------+---------+
                      |
                     AHB
                      |
      +---------------+---------------+
      |               |               |
    Flash            SRAM      AHB/APB Bridge
                                      |
                                     APB
                                      |
                        +-------------+-------------+
                        |             |             |
                      UART           SPI          TIMER

The MPU enforces access permissions on CPU accesses to:

Flash
SRAM
Peripheral regions

AHB handles:

High-speed data transfers

APB handles:

Peripheral control registers
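
In practice, the address of an access decides which path it takes. The Python sketch below shows the idea with a made-up memory map. The ranges are hypothetical and chosen only for illustration, so consult the reference manual of a real part for its actual map. The route function is likewise an invented name.

```python
# Hypothetical address ranges (start, end, target, path), for illustration only.
MEMORY_MAP = [
    (0x0800_0000, 0x080F_FFFF, "Flash", "AHB"),
    (0x2000_0000, 0x2001_FFFF, "SRAM", "AHB"),
    (0x4000_0000, 0x4000_FFFF, "Peripheral registers", "AHB -> bridge -> APB"),
]


def route(address: int) -> str:
    """Return which target an address selects and the bus path used to reach it."""
    for start, end, target, path in MEMORY_MAP:
        if start <= address <= end:
            return f"{target} via {path}"
    return "bus fault: unmapped address"


print(route(0x2000_0004))  # SRAM via AHB
print(route(0x4000_0000))  # Peripheral registers via AHB -> bridge -> APB
print(route(0x9000_0000))  # bus fault: unmapped address
```

The CPU issues the same kind of load or store in every case. Only the address decoding decides whether the access stays on AHB or crosses the bridge to APB.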

Internal Buses vs External Communication Buses

So far we discussed internal buses.

But MCUs also connect to the outside world using:

SPI
I2C
UART
CAN
USB

Important distinction:

Internal bus → connects CPU to SPI module

SPI bus → connects SPI module to external sensor

Flow example:

CPU → AHB → APB → SPI Peripheral → SPI External Bus → Sensor
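
The two sides of this flow can be sketched in Python with a toy SPI module. One method stands for the register write that arrives over the internal bus, and a list records the bits that appear on the external MOSI wire. The class, method, and attribute names are hypothetical, and MSB-first shifting is an assumption (a common default, but configurable on many parts). Clock and chip-select lines are omitted.

```python
class SpiPeripheral:
    """Toy SPI module: a data register on the internal side, a MOSI line on the external side."""

    def __init__(self):
        self.mosi_trace = []  # bit values seen on the external MOSI wire, in order

    def write_data_register(self, value: int) -> None:
        """Internal-bus side: the CPU stores one byte, which is then shifted out MSB first."""
        for bit_index in range(7, -1, -1):
            self.mosi_trace.append((value >> bit_index) & 1)


spi = SpiPeripheral()
spi.write_data_register(0xA5)   # one parallel register write by the CPU
print(spi.mosi_trace)           # [1, 0, 1, 0, 0, 1, 0, 1] seen serially by the sensor
```

The CPU performs a single parallel write on the internal bus. The peripheral turns it into eight serial bit times on the external bus.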

Understanding this separation prepares us for peripheral driver design.