ES: RTOS Core Concepts

An RTOS (Real-Time Operating System) is an operating system designed to provide deterministic task scheduling and guaranteed response times to meet defined deadlines.

The Meaning of Real-Time

Before we define an RTOS, we must define real-time correctly.

Real-time does not mean fast.

Real-time means:

The system must respond within a defined time constraint.

Correctness = logical correctness + timing correctness.

If timing is wrong → system is wrong.

Example:

Sensor Trigger
     |
     v
Control Algorithm
     |
     v
Actuator Output (must happen before deadline)

If actuator update is late → instability, damage, or safety risk.
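The rule "correctness = logical correctness + timing correctness" can be sketched as a single check. This is only an illustration: the `cycle_result_t` type, its fields, and the microsecond timestamps are assumptions made for this example, not part of any particular RTOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one sensor-to-actuator cycle (times in microseconds). */
typedef struct {
    uint32_t trigger_us;  /* when the sensor event arrived */
    uint32_t output_us;   /* when the actuator was updated */
    bool     value_ok;    /* did the algorithm compute the right value? */
} cycle_result_t;

/* A cycle is correct only if the value is right AND it arrived in time. */
static bool cycle_is_correct(const cycle_result_t *r, uint32_t deadline_us)
{
    assert(r != NULL);
    /* Unsigned subtraction stays valid even if the timestamp counter wrapped. */
    uint32_t response_us = r->output_us - r->trigger_us;
    return r->value_ok && (response_us <= deadline_us);
}
```

With a 500 µs deadline, a cycle that produces the right value after 600 µs is reported as incorrect, exactly like a cycle that produces the wrong value on time.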

This immediately leads us to classify real-time systems.


Types of Real-Time Systems

Real-time systems are categorized by what happens when a deadline is missed.

Hard Real-Time

Missing a deadline = system failure.

Examples:

  • Airbag controller
  • ABS braking system
  • Pacemaker
  • Flight control
  • Industrial safety system

Timing model:

Time ---------------------------------------------------->

Event        |----- Deadline -----|
Execution    |--------------------|---X

X after deadline → FAILURE

Engineering characteristics:

  • Deterministic scheduling
  • Bounded worst-case execution time (WCET)
  • Minimal jitter
  • Static memory preferred
  • Strict interrupt latency control

Firm Real-Time

If a deadline is missed, the result is useless, but the system continues running.

Examples:

  • Control loop sampling
  • Radar data frame
  • Sensor fusion cycle

Time ---------------------------------------------------->

Event        |----- Deadline -----|
Execution    |--------------------|---X

Late result discarded, system continues

Soft Real-Time

A missed deadline degrades performance, but the late result is still usable.

Examples:

  • Audio streaming
  • Video playback
  • BLE wearable metrics
  • Smart home automation

Time ---------------------------------------------------->

Event        |----- Deadline -----|
Execution    |--------------------|---X

Late → glitch, system survives
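The three classes differ only in what happens after a miss, which can be sketched as a small policy function. The enum names and the `on_job_complete` helper are hypothetical, chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RT_HARD, RT_FIRM, RT_SOFT } rt_class_t;

typedef enum {
    ACT_USE_RESULT,      /* deadline met: use the result normally */
    ACT_USE_DEGRADED,    /* soft miss: use the late result, accept a glitch */
    ACT_DISCARD_RESULT,  /* firm miss: drop the result, keep running */
    ACT_FAIL_SAFE        /* hard miss: treat as system failure */
} rt_action_t;

/* Decide what to do with a finished job, given its class and timing. */
static rt_action_t on_job_complete(rt_class_t cls,
                                   uint32_t response_us,
                                   uint32_t deadline_us)
{
    if (response_us <= deadline_us) {
        return ACT_USE_RESULT;
    }
    switch (cls) {
    case RT_HARD: return ACT_FAIL_SAFE;
    case RT_FIRM: return ACT_DISCARD_RESULT;
    case RT_SOFT: return ACT_USE_DEGRADED;
    }
    return ACT_FAIL_SAFE; /* unknown class: choose the safest reaction */
}
```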

Why Real-Time Type Matters

Real-time classification defines:

  • Hardware choice
  • Kernel design
  • Scheduler type
  • Memory strategy
  • Power model
  • Certification requirements

Architecture flow:

Application Requirement
        |
        v
Real-Time Type (Hard / Firm / Soft)
        |
        v
Scheduler Strategy
        |
        v
Kernel Configuration
        |
        v
Hardware Selection

What is an RTOS?

An RTOS (Real-Time Operating System) is an operating system designed to guarantee deterministic timing behavior.

Not “fast”.

Not “high performance”.

But predictable.

An RTOS must:

  • Provide deterministic task scheduling
  • Guarantee bounded interrupt latency
  • Manage concurrency predictably
  • Meet real-time deadlines

In embedded systems, correctness is not only about _what_ happens — but also _when_ it happens.

Examples:

  • Airbag deployment → must happen within milliseconds.
  • Motor control PWM update → must happen at precise intervals.
  • Communication protocol timeout → must expire within a known tolerance of the expected time.

If timing is wrong, the system is wrong.

This leads to the core definition:

An RTOS is an operating system designed to manage tasks with deterministic scheduling and bounded latency.

It sits between application and hardware:

Application Tasks
     |
     v
+-------------------+
|      RTOS         |
|  Scheduler        |
|  IPC              |
|  Timing           |
+-------------------+
     |
     v
Hardware (CPU, Timers, Peripherals)

From this position, the RTOS controls when each task executes.

But to understand why an RTOS exists, we must compare it with general-purpose operating systems.


RTOS vs General Purpose OS (Linux, Windows)

General Purpose Operating Systems (GPOS) like Linux and Windows are designed for:

  • Throughput
  • Fairness
  • User experience
  • Multi-user systems
  • Resource sharing

An RTOS is designed for:

  • Deterministic latency
  • Deadline guarantees
  • Embedded control
  • Low memory footprint

Let’s compare behavior.

General Purpose OS Behavior

Task A ----\
Task B -------> Scheduler ---> CPU
Task C ----/

GPOS goals:

  • Maximize CPU utilization
  • Optimize average performance

The cost: individual tasks may be delayed unpredictably.

You might get:

  • Interrupt latency: unpredictable
  • Scheduling latency: variable
  • Cache effects: unpredictable

Fine for desktop.

Dangerous for motor control.


RTOS Behavior

High Priority Task  ---> immediate execution
Medium Task
Low Task

Rules:

  • Higher priority always wins
  • Preemption guaranteed
  • Latency bounded

Key difference:

| Aspect    | GPOS        | RTOS          |
|-----------|-------------|---------------|
| Goal      | Throughput  | Determinism   |
| Latency   | Best effort | Bounded       |
| Memory    | Large       | Small         |
| Scheduler | Complex     | Deterministic |

Now that we understand _why_ an RTOS exists, we need to understand _how it is built_.

That leads us to the kernel architecture.


Kernel Architecture Overview

The kernel is the core of the RTOS.

Typical RTOS kernel components:

+----------------------------------+
| Scheduler                        |
|----------------------------------|
| Task Management                  |
|----------------------------------|
| IPC (Queue, Semaphore, Mutex)    |
|----------------------------------|
| Time Management                  |
|----------------------------------|
| Interrupt Handling               |
+----------------------------------+

Let’s break down the flow:

Interrupt Occurs
      |
      v
ISR Executes
      |
      v
RTOS may trigger scheduler
      |
      v
Higher priority task runs

Unlike Linux, RTOS kernels usually:

  • Are monolithic but small
  • Prefer static memory allocation
  • Provide no dynamic process isolation
  • Often run entirely in privileged mode

The heart of this architecture is the scheduler.

But before designing scheduler behavior, we must understand scheduling types.


Preemptive vs Cooperative Scheduling

This is fundamental.

Cooperative Scheduling

Tasks voluntarily give up the CPU.

Task A running
   |
   | (calls yield)
   v
Task B runs

If Task A never yields → system blocked.

Pros:

  • Simple
  • Low overhead

Cons:

  • One bad task can block system
  • Not suitable for hard real-time
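A cooperative scheduler can be as small as a loop over function pointers. The sketch below assumes each task does one bounded step and then returns, so that returning plays the role of the yield; `coop_task_t`, `coop_run`, and `count_step` are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*coop_task_fn)(void *arg);

typedef struct {
    coop_task_fn run;  /* runs one step, then returns (this is the yield) */
    void        *arg;  /* per-task state */
} coop_task_t;

/* Run every task once per pass, for a fixed number of passes. */
static void coop_run(const coop_task_t *tasks, size_t count, unsigned passes)
{
    assert(tasks != NULL || count == 0);
    for (unsigned p = 0; p < passes; ++p) {
        for (size_t i = 0; i < count; ++i) {
            /* If this call never returns, no other task ever runs again. */
            tasks[i].run(tasks[i].arg);
        }
    }
}

/* Example task step: increment the counter passed as argument. */
static void count_step(void *arg)
{
    unsigned *counter = arg;
    ++*counter;
}
```

Running two `count_step` tasks for three passes increments each counter three times. The comment inside the loop marks the weakness listed above: nothing in the loop can take the CPU back from a task that does not return.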

Preemptive Scheduling

A higher-priority task can interrupt a lower-priority task as soon as it becomes ready.

Low Priority Task running
        |
        | Interrupt occurs
        v
High Priority Task preempts

Preemption is typically triggered by:

  • Timer tick
  • Interrupt event
  • Unblock of higher priority task

Preemption keeps the response time of high-priority tasks bounded and predictable.

This leads to a fundamental question:

What exactly is a task in an RTOS?


Task / Thread Model

In an RTOS, the main execution unit is the task (thread).

Each task has:

  • Stack
  • Priority
  • State
  • Context (registers)
  • Entry function
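Kernels usually collect these fields in a task control block (TCB). The layout below is a hedged sketch, not the structure of any specific RTOS: `tcb_t`, `task_init`, and `idle_entry` are hypothetical names, and a full-descending stack (as on ARM Cortex-M) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TASK_READY, TASK_RUNNING, TASK_BLOCKED, TASK_SUSPENDED } task_state_t;
typedef void (*task_entry_t)(void *arg);

/* Hypothetical task control block holding the per-task items listed above. */
typedef struct {
    uint32_t     *stack_top;    /* saved stack pointer; the context is stored on the stack */
    uint32_t     *stack_base;   /* lowest address of the stack buffer */
    size_t        stack_words;  /* stack size in 32-bit words */
    uint8_t       priority;
    task_state_t  state;
    task_entry_t  entry;        /* entry function */
    void         *arg;          /* argument passed to the entry function */
} tcb_t;

/* Fill in a TCB; the task starts in READY with an empty stack. */
static void task_init(tcb_t *t, task_entry_t entry, void *arg,
                      uint8_t priority, uint32_t *stack, size_t stack_words)
{
    assert(t != NULL && entry != NULL && stack != NULL && stack_words > 0);
    t->stack_base  = stack;
    t->stack_words = stack_words;
    t->stack_top   = stack + stack_words; /* assumed: stack grows downward */
    t->priority    = priority;
    t->state       = TASK_READY;
    t->entry       = entry;
    t->arg         = arg;
}

/* Placeholder entry function used only to demonstrate task_init. */
static void idle_entry(void *arg)
{
    (void)arg;
}
```

Passing a statically allocated stack buffer into `task_init` matches the static-memory preference of hard real-time designs.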

Memory structure:

RAM
---------------------------------
| Task A Stack                 |
---------------------------------
| Task B Stack                 |
---------------------------------
| Task C Stack                 |
---------------------------------
| Heap / Static Data           |
---------------------------------

Typical task states:

READY
RUNNING
BLOCKED
SUSPENDED

        +---------+
        | READY   |
        +----+----+
             |
             v
         +---+---+
         |RUNNING|
         +---+---+
             |
   +---------+---------+
   |                   |
   v                   v
BLOCKED            READY (preempted)

When a task blocks (e.g., waiting on a semaphore), the scheduler picks the next ready task.
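The state diagram can be expressed as a small transition function. The event names are assumptions made for this sketch; in particular, the unblock, suspend, and resume transitions are not drawn in the diagram and follow the common convention that a resumed or unblocked task returns to READY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ST_READY, ST_RUNNING, ST_BLOCKED, ST_SUSPENDED } task_state_t;

typedef enum {
    EV_DISPATCH,  /* scheduler selects the task       */
    EV_PREEMPT,   /* higher-priority task takes over  */
    EV_BLOCK,     /* task waits on a semaphore, delay */
    EV_UNBLOCK,   /* awaited event arrives            */
    EV_SUSPEND,   /* explicit suspend request         */
    EV_RESUME     /* explicit resume request          */
} task_event_t;

/* Apply one event. Returns false and leaves *state unchanged if the
   transition is not allowed in this model. */
static bool task_apply_event(task_state_t *state, task_event_t ev)
{
    assert(state != NULL);
    task_state_t s = *state;
    switch (ev) {
    case EV_DISPATCH: if (s != ST_READY)     return false; *state = ST_RUNNING;   return true;
    case EV_PREEMPT:  if (s != ST_RUNNING)   return false; *state = ST_READY;     return true;
    case EV_BLOCK:    if (s != ST_RUNNING)   return false; *state = ST_BLOCKED;   return true;
    case EV_UNBLOCK:  if (s != ST_BLOCKED)   return false; *state = ST_READY;     return true;
    case EV_SUSPEND:  if (s == ST_SUSPENDED) return false; *state = ST_SUSPENDED; return true;
    case EV_RESUME:   if (s != ST_SUSPENDED) return false; *state = ST_READY;     return true;
    }
    return false;
}
```

Note that a BLOCKED task cannot be dispatched directly: it must first be unblocked into READY, and only then can the scheduler pick it.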

But how does the CPU switch between tasks?

This is where context switching happens.


Context Switching Internals

Context switching is the process of:

  • Saving current task CPU registers
  • Restoring another task's registers
  • Switching stack pointer

Let’s look at what CPU state includes:

General Purpose Registers
Program Counter (PC)
Stack Pointer (SP)
Status Register

When switching:

Low Task Running
    |
    | Save registers to its stack
    v
Load High Task registers
    |
    v
High Task continues

Stack during switch:

Task A Stack
------------------
| R0              |
| R1              |
| R2              |
| ...             |
| PC              |
| PSR             |
------------------
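Real context switches are written in assembly because C cannot touch the CPU registers directly. The save and restore steps can still be modeled in plain C. The sketch below is a host-side simulation under that assumption: `cpu_context_t` stands in for the register file, and each task keeps a saved copy instead of pushing registers onto its own stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the CPU state listed above. */
typedef struct {
    uint32_t r[13];  /* general-purpose registers R0..R12 */
    uint32_t sp;     /* stack pointer */
    uint32_t lr;     /* link register */
    uint32_t pc;     /* program counter */
    uint32_t psr;    /* status register */
} cpu_context_t;

/* Save the running task's registers, then load the next task's registers.
   Loading the saved SP is what switches the CPU to the next task's stack. */
static void context_switch(cpu_context_t *cpu,
                           cpu_context_t *save_to,
                           const cpu_context_t *load_from)
{
    assert(cpu != NULL && save_to != NULL && load_from != NULL);
    *save_to = *cpu;       /* step 1: save current task context */
    *cpu     = *load_from; /* steps 2 and 3: restore registers, including SP */
}
```

Switching from task A to task B and back again leaves task A with exactly the register values it had before, which is why a task never notices that it was interrupted.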

Important:

A context switch is usually performed inside:

  • SysTick handler
  • PendSV (ARM Cortex-M)
  • Software interrupt

On Cortex-M:

Interrupt
   |
   v
PendSV
   |
   v
Context Switch

Why PendSV?

Because the RTOS assigns it the lowest interrupt priority, the context switch is deferred until every other ISR has finished, which keeps switching safe.

Context switch cost matters:

  • More registers = more time
  • FPU enabled = larger context

This brings us naturally to scheduler design.


Scheduler Design Fundamentals

The scheduler decides:

Which READY task runs next?

Common scheduler types:

Priority-Based Scheduler

Most common in RTOS.

Algorithm:

  • Always run highest priority READY task.

Data structure often:

Priority Bitmap
or
Array of Ready Lists

Example:

Priority 3: Task A
Priority 2: Task B
Priority 1: Task C

Scheduler scans from highest priority down.

Optimization trick:

Bitmask = 0b01011000
CLZ instruction finds highest bit

O(1) scheduling.
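The bitmap trick can be sketched as follows, assuming bit N of the bitmap is set when some task of priority N is READY and a larger number means higher priority. The portable `clz32` loop is a stand-in so the example runs anywhere; a real port would use the CLZ instruction through a compiler intrinsic such as `__builtin_clz` (GCC/Clang), which is what makes the lookup constant-time.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros for a nonzero 32-bit value.
   Stand-in for the hardware CLZ instruction. */
static unsigned clz32(uint32_t x)
{
    assert(x != 0);
    unsigned n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

/* Mark a priority level as having (or no longer having) a READY task. */
static void mark_ready(uint32_t *bitmap, unsigned prio)
{
    assert(bitmap != 0 && prio < 32);
    *bitmap |= (1u << prio);
}

static void mark_not_ready(uint32_t *bitmap, unsigned prio)
{
    assert(bitmap != 0 && prio < 32);
    *bitmap &= ~(1u << prio);
}

/* Highest READY priority, or -1 if no task is READY. */
static int highest_ready_priority(uint32_t bitmap)
{
    if (bitmap == 0) {
        return -1;
    }
    return 31 - (int)clz32(bitmap);
}
```

For the bitmask 0b01011000 (0x58), bits 6, 4, and 3 are set, so the scheduler selects priority 6 without scanning any list.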

Round Robin Inside Same Priority

If multiple tasks share priority:

Task A (P2)
Task B (P2)
Task C (P2)

They rotate: A -> B -> C -> A ...

This prevents starvation at the same priority level.
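The rotation can be sketched with a fixed-size ready list and a moving index. The `rr_list_t` type and the integer task IDs are assumptions for this example; kernels commonly use linked lists and rotate on a time-slice expiry.

```c
#include <assert.h>
#include <stddef.h>

#define RR_MAX_TASKS 8

/* Ready list for ONE priority level. */
typedef struct {
    int    task_ids[RR_MAX_TASKS];  /* hypothetical task identifiers */
    size_t count;                   /* number of READY tasks at this level */
    size_t next;                    /* index of the task to run next */
} rr_list_t;

/* Return the next task ID in rotation, or -1 if the list is empty. */
static int rr_pick_next(rr_list_t *list)
{
    assert(list != NULL && list->count <= RR_MAX_TASKS);
    if (list->count == 0) {
        return -1;
    }
    int chosen = list->task_ids[list->next];
    list->next = (list->next + 1) % list->count;  /* wrap back to the first task */
    return chosen;
}
```

With three tasks in the list, four consecutive calls return the first, second, third, and then the first task again.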

Now the key question:

When does scheduler run?

That leads us to system tick.


Tick-Based vs Tickless Kernel

An RTOS needs a time base.

Classic method: periodic timer interrupt.

Tick-Based Kernel

A hardware timer generates an interrupt at a fixed interval, for example every 1 ms.

Timer Interrupt (every 1ms)
      |
      v
Increment tick count
      |
      v
Check delayed tasks
      |
      v
Possibly run scheduler
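The three steps of the tick interrupt can be sketched as below. All names (`tick_handler`, `task_delay`, the fixed `MAX_TASKS` table) are hypothetical, and a linear scan is used for clarity where real kernels typically keep a sorted delay list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TASKS 4

typedef struct {
    bool     delayed;    /* true while the task is waiting for a tick */
    uint32_t wake_tick;  /* tick count at which the task becomes READY */
} delay_entry_t;

static uint32_t      tick_count;
static delay_entry_t delays[MAX_TASKS];

/* Put task `id` to sleep for `ticks` timer ticks. */
static void task_delay(size_t id, uint32_t ticks)
{
    assert(id < MAX_TASKS);
    delays[id].delayed   = true;
    delays[id].wake_tick = tick_count + ticks;
}

/* Body of the periodic timer ISR. Returns true if a task woke up,
   meaning the scheduler should run. */
static bool tick_handler(void)
{
    bool need_resched = false;
    ++tick_count;                                   /* increment tick count */
    for (size_t i = 0; i < MAX_TASKS; ++i) {        /* check delayed tasks  */
        /* Signed difference keeps the comparison valid across counter wrap. */
        if (delays[i].delayed &&
            (int32_t)(tick_count - delays[i].wake_tick) >= 0) {
            delays[i].delayed = false;
            need_resched = true;                    /* possibly run scheduler */
        }
    }
    return need_resched;
}
```

A task delayed by three ticks stays asleep for the first two calls to `tick_handler` and is released on the third.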

Advantages:

  • Simple
  • Predictable timing

Disadvantages:

  • Wakes CPU even if idle
  • Power consumption higher

Tickless Kernel

Instead of fixed tick:

  • System programs next interrupt dynamically
  • Sleeps until next scheduled event

No fixed 1ms interrupt
Instead:
Sleep until next deadline

Flow:

All tasks blocked?
   |
   v
Find nearest wake-up time
   |
   v
Program hardware timer
   |
   v
Enter low power
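The "find nearest wake-up time" step might look like the sketch below. The function name and the parallel-array layout are assumptions for this example; the returned value is what the kernel would then program into the hardware timer before entering low power.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of ticks the CPU may sleep before the earliest waiting task is due.
   Returns 0 if a task is already due, or UINT32_MAX if no task is waiting. */
static uint32_t ticks_until_next_wake(uint32_t now,
                                      const uint32_t *wake_ticks,
                                      const bool *waiting,
                                      size_t count)
{
    assert((wake_ticks != NULL && waiting != NULL) || count == 0);
    uint32_t best = UINT32_MAX;
    for (size_t i = 0; i < count; ++i) {
        if (!waiting[i]) {
            continue;
        }
        /* Signed difference stays correct across tick counter wrap. */
        int32_t delta = (int32_t)(wake_ticks[i] - now);
        if (delta <= 0) {
            return 0;  /* already due: do not sleep */
        }
        if ((uint32_t)delta < best) {
            best = (uint32_t)delta;
        }
    }
    return best;
}
```

At tick 100, with tasks due at ticks 150 and 120, the function returns 20, so the CPU takes one timer interrupt instead of twenty periodic ticks.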

Better for:

  • Battery devices
  • Wearables
  • IoT sensors

But:

  • More complex
  • Timer reprogramming overhead

Bringing It All Together

Now let’s connect everything in one execution flow.

Example: UART data arrives.

UART Interrupt
      |
      v
ISR reads byte
      |
      v
Gives semaphore to high priority task
      |
      v
Scheduler triggered
      |
      v
Context switch
      |
      v
UART Processing Task runs

That flow involves:

  • Kernel architecture
  • Interrupt handling
  • Preemptive scheduling
  • Task model
  • Context switching
  • Scheduler logic
  • Tick or tickless timing
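The ISR side of the UART flow can be simulated on a host. Everything here is a hedged model: `bin_sem_t`, `sem_give_from_isr`, and `uart_isr` are hypothetical names, the received byte is passed in as a parameter instead of being read from a hardware register, and setting `switch_pending` stands in for requesting a context switch (on Cortex-M, pending PendSV).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal binary semaphore with at most one waiting task. */
typedef struct {
    bool    available;        /* set when given with no waiter */
    bool    task_waiting;     /* a task is BLOCKED on this semaphore */
    uint8_t waiter_priority;  /* priority of that task */
} bin_sem_t;

static uint8_t rx_byte;        /* last byte read by the ISR */
static bool    switch_pending; /* true when a context switch was requested */

/* Give the semaphore from interrupt context. Returns true if the unblocked
   task has higher priority than the task that was interrupted. */
static bool sem_give_from_isr(bin_sem_t *sem, uint8_t running_priority)
{
    assert(sem != NULL);
    if (sem->task_waiting) {
        sem->task_waiting = false;  /* waiter moves from BLOCKED to READY */
        return sem->waiter_priority > running_priority;
    }
    sem->available = true;          /* nobody waiting: remember the signal */
    return false;
}

/* UART receive ISR: read the byte, signal the task, request a switch. */
static void uart_isr(bin_sem_t *rx_sem, uint8_t data_reg, uint8_t running_priority)
{
    rx_byte = data_reg;
    if (sem_give_from_isr(rx_sem, running_priority)) {
        switch_pending = true;  /* scheduler runs once the ISR returns */
    }
}
```

The ISR does the minimum work (read the byte, signal) and leaves processing to the task, which keeps interrupt latency for other sources bounded.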

Everything is connected.