Inter-Task Communication (IPC) is the nervous system of an RTOS-based embedded system. Tasks do not exist in isolation; they cooperate, synchronize, exchange data, and coordinate timing. Without IPC, multitasking degenerates into isolated threads competing blindly for CPU time.
In embedded systems, IPC is not just about data exchange. It is about determinism, latency control, memory safety, and system integrity.
To understand IPC properly, we must begin with why it is required in the first place.
Why IPC Is Required
In a real embedded product, functionality is decomposed into independent tasks:
- Sensor task reads ADC periodically
- Communication task handles UART/Ethernet
- Control task runs algorithm
- Logging task stores data
- UI task updates display
These tasks must cooperate.
Consider this simplified architecture:
+-------------+
| Sensor Task |
+-------------+
       |
       v
+--------------+
| Control Task |
+--------------+
       |
       v
+--------------+
|   Actuator   |
+--------------+
The sensor produces data.
The control algorithm consumes data.
The actuator task must wait for a control decision.
If tasks cannot exchange information safely:
- Data corruption occurs
- Race conditions appear
- System becomes non-deterministic
- Priority inversion may happen
- Real-time guarantees are lost
Thus IPC is required for two core reasons:
• Data sharing
• Synchronization
Once we accept that tasks must share information, the next question becomes: _How do they share memory safely?_
Shared Memory
The most primitive IPC mechanism is shared memory.
Two tasks access the same memory region:
            RAM
   +---------------+
   | Shared Buffer |
   +---------------+
      ^         ^
      |         |
   Task A    Task B
Example:
volatile int sensor_value;           /* shared between the two tasks */
const int threshold = 100;           /* example limit */

void SensorTask(void) {
    sensor_value = read_adc();       /* writer */
}

void ControlTask(void) {
    if (sensor_value > threshold) {  /* reader */
        trigger_alarm();
    }
}
This works — but it is dangerous.
Why?
Because access is not atomic. If:
- Task A updates while Task B reads
- An interrupt occurs mid-write
- Compiler reorders operations
You get inconsistent data.
This leads us directly to the need for critical sections.
Critical Sections
A critical section is a protected region of code where shared resources are accessed.
In a single-core MCU, protection is often done by disabling interrupts:
enter_critical();
shared_variable++;
exit_critical();
Timeline:
Time →
Task A: |---- critical ----|
Interrupt: (blocked)
Task B: (waiting)
The idea is simple:
• Only one execution context accesses the resource at a time
• Prevent preemption during modification
However, disabling interrupts has consequences:
- Increased interrupt latency
- Jitter
- Reduced responsiveness
Critical sections must be:
• Very short
• Deterministic
• Non-blocking
But disabling interrupts is a low-level tool. For task-level synchronization, RTOS kernels provide structured primitives such as mutexes and semaphores.
Mutex vs Binary Semaphore
At first glance, a mutex and a binary semaphore look identical — both can hold values 0 or 1.
But their semantics differ.
Binary Semaphore
A binary semaphore is a signaling mechanism.
Example use case:
- ISR signals a task
- One task signals another
Semaphore = 0
ISR:
give(semaphore)
Task:
take(semaphore) → blocks until given
Binary semaphores are about event signaling, not ownership.
They do not track who owns them.
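The give/take pattern can be sketched on a host with POSIX semaphores standing in for the RTOS primitives (sem_post for give, sem_wait for take); the producer thread plays the role of the ISR:

```c
#include <pthread.h>
#include <semaphore.h>
#include <assert.h>

/* Host-side sketch: POSIX sem_post()/sem_wait() stand in for the
   RTOS give()/take(). On a real system the ISR would use the
   ISR-safe give variant. */
static sem_t event;
static int data_ready = 0;

void event_init(void) {
    sem_init(&event, 0, 0);          /* starts at 0: "not yet given" */
}

void *producer(void *arg) {          /* stands in for the ISR */
    (void)arg;
    data_ready = 1;                  /* publish the payload first */
    sem_post(&event);                /* give(semaphore) */
    return NULL;
}

int wait_for_event(void) {
    sem_wait(&event);                /* take(semaphore): blocks until given */
    return data_ready;
}
```

Note that no ownership is involved: any context may give, any task may take.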
Mutex
A mutex is a mutual exclusion lock with ownership.
Mutex State:
Unlocked
Locked by Task X
Only the owning task can unlock it.
Important difference:
• Mutex supports priority inheritance
• Binary semaphore typically does not
Why Priority Inheritance Matters
Consider:
High Priority Task ----------+   (blocked, waiting for mutex)
                             |
Medium Priority Task (runs)  |
                             v
Low Priority Task (holds mutex)
If the low-priority task holds the mutex while the high-priority task waits for it, a medium-priority task can keep preempting the low-priority holder, and the high-priority task is starved.
This is priority inversion.
A proper mutex boosts the priority of the low-priority task temporarily.
This is why:
• Use mutex for protecting shared resources
• Use binary semaphore for signaling events
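With POSIX threads the same idea is opt-in: a mutex must be created with the PTHREAD_PRIO_INHERIT protocol to get priority inheritance (many RTOS mutexes enable it implicitly). A minimal sketch:

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Create a mutex whose holder temporarily inherits the priority of
   the highest-priority waiter, bounding the inversion described
   above. Returns 0 on success. */
int make_pi_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    int rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```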
Once binary signaling is understood, we can extend it further with counting semaphores.
Counting Semaphores
A counting semaphore maintains an integer count that can be greater than 1 (it ranges from 0 up to a configured maximum).
It is useful for:
• Resource pools
• Producer-consumer with limited buffer slots
• Tracking available units
Example: 5 identical buffers available.
Semaphore count = 5
Task:
take() → decrement
give() → increment
Representation:
Available Buffers:
[ ][ ][ ][ ][ ]
 ^  ^  ^
 taken by tasks
Counting semaphores allow multiple simultaneous holders, unlike a mutex, and they are not ownership-based.
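The five-buffer example can be sketched with a POSIX counting semaphore on a host (sem_init/sem_wait/sem_post standing in for the RTOS create/take/give calls):

```c
#include <semaphore.h>
#include <assert.h>

#define NUM_BUFFERS 5

/* Counting semaphore tracking free buffers in a pool. */
static sem_t free_buffers;

void pool_init(void) {
    sem_init(&free_buffers, 0, NUM_BUFFERS);  /* count starts at 5 */
}

void buffer_take(void) { sem_wait(&free_buffers); }  /* blocks at 0   */
void buffer_give(void) { sem_post(&free_buffers); }  /* frees a slot  */

int buffers_available(void) {
    int v = 0;
    sem_getvalue(&free_buffers, &v);
    return v;
}
```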
This leads naturally to higher-level synchronization structures such as event groups.
Event Groups
Sometimes tasks must wait for multiple conditions simultaneously.
Example:
- Network ready
- Sensor calibrated
- Storage mounted
Instead of using multiple semaphores, an event group uses bit flags.
Event Register (8-bit example)
Bit 0 → Network Ready
Bit 1 → Sensor Ready
Bit 2 → Storage Ready
Task waits for: WAIT_FOR( bit0 AND bit1 )
View:
Current State: 0 1 1 0 0 0 0 0
               | | |
               | | +---- Storage Ready
               | +------ Sensor Ready
               +-------- Network Ready
Event groups allow:
• Wait for ANY bit
• Wait for ALL bits
• Auto-clear bits
They are synchronization tools, not data transport tools.
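The bit logic behind those wait conditions is plain masking. The sketch below models only the flags and the ANY/ALL tests, not the blocking a real event group adds; the bit names are illustrative:

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative event bits, matching the 8-bit example above. */
#define NETWORK_READY  (1u << 0)
#define SENSOR_READY   (1u << 1)
#define STORAGE_READY  (1u << 2)

static uint8_t event_bits = 0;

void set_bits(uint8_t bits)   { event_bits |= bits; }
void clear_bits(uint8_t bits) { event_bits &= (uint8_t)~bits; }

/* WAIT_FOR(bit0 AND bit1): true only when every requested bit is set */
int all_set(uint8_t mask) { return (event_bits & mask) == mask; }

/* Wait-for-ANY: true when at least one requested bit is set */
int any_set(uint8_t mask) { return (event_bits & mask) != 0; }
```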
For actual data passing, we move toward message-based mechanisms.
Message Queues
A message queue is a buffered communication channel.
Architecture:
            Queue (FIFO)
+--------------------------------+
| Msg1 | Msg2 | Msg3 |    ...    |
+--------------------------------+
   ^                         ^
Producer                 Consumer
Properties:
• FIFO order
• Fixed message size
• Blocking send/receive
• Safe between tasks and ISR (if supported)
Example:
struct Data {
    int value;
    int timestamp;
};

struct Data data;
send(queue, &data);      /* message is copied into the queue */
receive(queue, &data);   /* and copied back out */
Queues provide:
• Decoupling between producer and consumer
• Temporal isolation
• Flow control
If queue fills:
- Sender blocks
- Or drops message (depending on design)
Queues are excellent for structured data passing.
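A minimal copy-based FIFO illustrating the queue semantics above. This non-blocking sketch returns -1 where a real RTOS queue would block or time out:

```c
#include <assert.h>

/* Fixed-size, copy-based FIFO queue (illustrative sketch). */
#define QUEUE_LEN 4

struct Data { int value; int timestamp; };

static struct Data slots[QUEUE_LEN];
static int head = 0, tail = 0, count = 0;

int q_send(const struct Data *msg) {
    if (count == QUEUE_LEN) return -1;   /* full: block or drop by design */
    slots[tail] = *msg;                  /* copy message in */
    tail = (tail + 1) % QUEUE_LEN;
    count++;
    return 0;
}

int q_receive(struct Data *msg) {
    if (count == 0) return -1;           /* empty */
    *msg = slots[head];                  /* copy message out */
    head = (head + 1) % QUEUE_LEN;
    count--;
    return 0;
}
```

Because each message is copied in and out, producer and consumer never touch the same memory at the same time, which is what gives the queue its decoupling property.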
But sometimes we only need a single message slot.
Mailboxes
A mailbox is a single-slot message container.
Mailbox:
+-------------+
| Pointer |
+-------------+
It usually transfers a pointer to data rather than copying data.
Example:
Task A:
ptr = &buffer;
post(mailbox, ptr);
Task B:
ptr = pend(mailbox);
Mailboxes are useful when:
• Only the latest message matters
• Overwriting old data is acceptable
• Memory copying must be avoided
They are lighter than queues.
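The post/pend pair reduces to a single pointer slot. In this sketch, posting simply overwrites any unread message, matching the "only the latest value matters" use case:

```c
#include <stddef.h>
#include <assert.h>

/* Single-slot mailbox passing a pointer (illustrative sketch):
   posting overwrites any unread message, so only the latest
   value survives. */
static void *mailbox_slot = NULL;

void post(void *msg) {
    mailbox_slot = msg;        /* overwrite old data */
}

void *pend(void) {
    void *msg = mailbox_slot;  /* NULL if mailbox is empty */
    mailbox_slot = NULL;
    return msg;
}
```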
But when continuous byte streams are needed — for example UART data — stream buffers are more appropriate.
Stream Buffers
Stream buffers are optimized for continuous data flow.
Unlike message queues:
• No fixed message boundaries
• Byte-oriented
• Circular buffer internally
Architecture:
Circular Buffer
Head →
+-----------------------+
| A | B | C | D | E | F |
+-----------------------+
← Tail
Used for:
• UART RX/TX
• Audio samples
• Network stacks
Producer writes bytes.
Consumer reads bytes.
Blocking occurs if:
- Buffer full (write blocks)
- Buffer empty (read blocks)
Stream buffers are efficient but still involve memory copying.
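The circular-buffer core of a stream buffer can be sketched as follows. Blocking is omitted, so each function simply reports how many bytes it actually moved:

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

/* Byte-oriented circular buffer, the core of a stream buffer
   (illustrative, non-blocking sketch). */
#define BUF_SIZE 8

static uint8_t ring[BUF_SIZE];
static size_t rd = 0, wr = 0, used = 0;

size_t stream_write(const uint8_t *src, size_t len) {
    size_t n = 0;
    while (n < len && used < BUF_SIZE) {   /* stop when full */
        ring[wr] = src[n++];
        wr = (wr + 1) % BUF_SIZE;          /* wrap around */
        used++;
    }
    return n;                              /* bytes actually written */
}

size_t stream_read(uint8_t *dst, size_t len) {
    size_t n = 0;
    while (n < len && used > 0) {          /* stop when empty */
        dst[n++] = ring[rd];
        rd = (rd + 1) % BUF_SIZE;
        used--;
    }
    return n;                              /* bytes actually read */
}
```

Note there are no message boundaries: a reader may consume four bytes of a six-byte write, exactly as a UART driver would.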
To reduce overhead further, we move toward zero-copy communication.
Zero-Copy Communication
In high-performance embedded systems, copying data costs:
• CPU cycles
• Cache pollution
• Memory bandwidth
Zero-copy avoids copying payload data.
Instead of: Producer → copy → Queue → copy → Consumer
We do:
Producer → allocate buffer
→ send pointer
Consumer → process directly
→ free buffer
Architecture:
Memory Pool
+----+----+----+----+
| B1 | B2 | B3 | B4 |
+----+----+----+----+
Flow:
- Producer takes buffer from pool
- Fills data
- Sends pointer via queue/mailbox
- Consumer processes
- Returns buffer to pool
This achieves:
• Minimal latency
• Deterministic timing
• Reduced memory footprint
• Better scalability
Zero-copy is common in:
- Network stacks
- DMA-based drivers
- High-speed logging
- Audio/video pipelines
But it requires disciplined memory ownership management.
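A minimal sketch of the flow above: a static pool, an allocator for the producer, and a free for the consumer. Only the pointer changes hands; the payload stays in place. The pool_alloc/pool_free names are illustrative:

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Zero-copy flow (sketch): producer takes a buffer, fills it, and
   hands over the POINTER (via queue/mailbox); the consumer reads in
   place and returns the buffer. Payload bytes are never copied
   between the two sides. */
#define POOL_BUFS 4
#define BUF_BYTES 32

static char pool[POOL_BUFS][BUF_BYTES];
static int  in_use[POOL_BUFS];

char *pool_alloc(void) {                 /* producer takes a buffer */
    for (int i = 0; i < POOL_BUFS; i++)
        if (!in_use[i]) { in_use[i] = 1; return pool[i]; }
    return NULL;                         /* pool exhausted */
}

void pool_free(char *buf) {              /* consumer returns it */
    for (int i = 0; i < POOL_BUFS; i++)
        if (pool[i] == buf) { in_use[i] = 0; return; }
}
```

The discipline lies in ownership: after sending the pointer, the producer must not touch the buffer again until the consumer has returned it to the pool.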
IPC Architecture View
A mature embedded system combines multiple IPC mechanisms:
                +-------------+
                | Event Group |
                +-------------+
                       |
+--------+      +-------------+      +---------+
| Sensor |----->|    Queue    |----->| Control |
+--------+      +-------------+      +---------+
     |                                    |
     v                                    v
Stream Buffer                         Mailbox
Design rule in real systems:
• Use mutex for protection
• Use semaphore for signaling
• Use event groups for multi-condition sync
• Use queues for structured data
• Use stream buffers for byte streams
• Use zero-copy for high throughput
IPC Mechanisms Summary Table
| Mechanism | Purpose | Data Transfer | Ownership | Blocking Support | Typical Use Case | Real-Time Notes |
|---|---|---|---|---|---|---|
| Shared Memory | Direct data sharing | Direct memory access | None | No (needs protection) | Fast data access | Must protect with critical section |
| Critical Section | Protect shared resource | N/A | N/A | No (short only) | Atomic variable update | Increases interrupt latency |
| Mutex | Mutual exclusion | No data | Yes (owner-based) | Yes | Protect peripheral, file system | Supports priority inheritance |
| Binary Semaphore | Event signaling | No data | No | Yes | ISR → Task notification | No ownership tracking |
| Counting Semaphore | Resource counting | No data | No | Yes | Buffer pool, resource pool | Multiple simultaneous holders |
| Event Group | Multi-condition sync | Bit flags | No | Yes | Wait for multiple events | Efficient multi-bit wait |
| Message Queue | Structured message passing | Copy-based | No | Yes | Producer–Consumer | Deterministic FIFO |
| Mailbox | Single message slot | Usually pointer | No | Yes | Latest-value transfer | Lightweight queue |
| Stream Buffer | Continuous byte stream | Copy-based | No | Yes | UART, audio data | Circular buffer based |
| Zero-Copy | High-performance transfer | Pointer-based | Managed externally | Yes (via queue/semaphore) | Networking, DMA | Minimizes latency & CPU load |