LabWired Whitepaper

The Ghost in the CI: why embedded tests flake and how to stop it

Priority inversion repeatedly reset the Mars Pathfinder computer, and the same class of timing bug is what makes embedded CI flaky today. This paper makes the case for deterministic execution.

The ghost 150 million miles away

In 1997, the Mars Pathfinder started mysteriously rebooting on the surface of Mars. The culprit was not a broken component but a priority inversion in its scheduler.

[Figure: task timeline. LOW PRIO (ASI/MET) locks the mutex; MED PRIO (comm) preempts the low-priority task; HI PRIO (bc_dist) waits for the mutex, blocked by the medium-priority task; the stall ends in a watchdog reset.]

Simplified priority inversion: the medium-priority task keeps running, so the low-priority task never gets to release the lock and the high-priority task never gets the resource it needs.

A meteorological task (low priority) held a shared lock that the critical bus manager (high priority) was waiting for. A communications task (medium priority) preempted the low-priority task and ran so long that the lock was never released, so the bus manager could not finish its work. The watchdog timer, sensing the stall, rebooted the entire system. Without deterministic execution, timing-dependent bugs like this one are almost impossible to catch before the system reaches its destination.
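The scenario above can be reproduced in a few lines of tick-based simulation. The sketch below is a toy model, not the Pathfinder software: the task names, tick counts, and watchdog deadline are invented for illustration. The `inherit` flag switches on priority inheritance, the standard remedy for this bug class, so the two outcomes can be compared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    priority: int          # larger number = more urgent
    release: int           # tick at which the task becomes ready
    work: int              # ticks of CPU time still needed
    needs_lock: bool       # True if all of its work happens under the mutex
    done_at: Optional[int] = None


def simulate(inherit: bool, deadline: int = 12, horizon: int = 40) -> dict:
    """Run the three-task scenario; all numbers are illustrative assumptions."""
    low = Task("low", 1, release=0, work=3, needs_lock=True)
    med = Task("med", 2, release=1, work=20, needs_lock=False)
    high = Task("high", 3, release=2, work=1, needs_lock=True)
    tasks = [low, med, high]
    owner = None  # task currently holding the mutex

    for tick in range(horizon):
        # Watchdog: the high-priority task must have finished by the deadline.
        if high.done_at is None and tick >= deadline:
            return {"reset_at": tick, "high_done_at": None}

        ready = [t for t in tasks if t.release <= tick and t.work > 0]
        if not ready:
            break

        def effective(t: Task) -> int:
            # Priority inheritance: the lock owner borrows the priority
            # of the most urgent task waiting on the lock.
            if inherit and t is owner:
                waiting = [w.priority for w in ready
                           if w.needs_lock and w is not owner]
                return max([t.priority] + waiting)
            return t.priority

        # A task that needs the mutex while someone else holds it is blocked.
        runnable = [t for t in ready
                    if not (t.needs_lock and owner not in (None, t))]
        current = max(runnable, key=effective)
        if current.needs_lock:
            owner = current
        current.work -= 1
        if current.work == 0:
            current.done_at = tick + 1
            if owner is current:
                owner = None

    return {"reset_at": None, "high_done_at": high.done_at}


print(simulate(inherit=False))  # {'reset_at': 12, 'high_done_at': None}
print(simulate(inherit=True))   # {'reset_at': None, 'high_done_at': 5}
```

Because the simulation advances in fixed ticks, the watchdog reset lands on the same tick on every run, which is the property a deterministic test needs.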

Why firmware flakes

Flakiness is often a symptom of race conditions: bugs that only appear when specific events happen in a precise, often microsecond-aligned order. The Therac-25 tragedy is the best-known example. A race condition in the control software allowed a high-energy beam to fire without a safety shield, but only if the operator was "too fast" with their keyboard commands.

[Figure: timeline of normal operation versus the race window (< 8 s), which is entered when fast operator input arrives after data entry is complete.]

The race condition: if the operator corrected a setting within 8 seconds, the software would sometimes fail to update the hardware state correctly, leading to an unshielded beam.

1. The "real-time" fallacy

Conventional testing relies on real-world time. If your CI runner doesn't happen to hit that exact microsecond window, the test passes. You get a green check, but the bug is still there. This is the real-time fallacy: thinking that testing in "real-time" is the same as testing all timing possibilities.
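The difference between testing one timing and testing all of them can be sketched with a toy model. The code below is not the Therac-25 software: the state names, the 8-tick setup window, and the deliberately planted bug are assumptions made for illustration. A single test at a "typical" edit time passes, while a sweep over every edit time in simulated seconds exposes the whole failing window.

```python
SETUP_TICKS = 8  # assumed length of the vulnerable window, in simulated seconds


def run_scenario(edit_at: int, horizon: int = 30) -> bool:
    """Return True if the hardware ends up matching the operator's final setting."""
    requested = "xray"
    # Setup begins at tick 0 and latches the setting that was requested then.
    latched, setup_ends = requested, SETUP_TICKS
    hardware = None

    for tick in range(horizon):
        if tick == edit_at:
            requested = "electron"
            if tick >= setup_ends:
                # Controller is idle: restart setup with the new setting.
                latched, setup_ends = requested, tick + SETUP_TICKS
            # Planted bug: an edit that lands while setup is still running
            # is silently dropped and never reaches the hardware.
        if tick == setup_ends:
            hardware = latched

    return hardware == requested


# One test at a "typical" operator speed passes and gives a green check.
print(run_scenario(edit_at=15))  # True

# Sweeping every edit time finds the full race window.
failing = [t for t in range(20) if not run_scenario(t)]
print(failing)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

The sweep is only practical because time is a loop variable rather than a wall clock: each of the twenty scenarios runs instantly and gives the same result on every machine.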

2. Non-deterministic jitter

Because host PCs have varying background loads, a test that passes at 10:00 AM might fail at 2:00 PM simply because of a background update. This jitter masks real bugs and turns CI into a game of chance. To catch a Therac-25 style bug, you need to control time, not just watch it.

The fix: separate simulated time from host time

To kill flakiness, we enforce a strict separation between simulated time and host time. The 1991 Patriot missile failure shows what happens when a system's notion of time drifts. The system counted time in tenths of a second, and 0.1 has no exact binary representation, so a tiny truncation error in its 24-bit fixed-point representation of time accumulated over 100 hours into a 0.34-second lag, enough to miss its target by more than half a kilometer.

[Figure: simulated time (ideal) versus system time (drifting) from 0 hours to 100 hours, ending with a 0.34 s drift.]

The accumulation of error: in the Patriot system, a precision error of 0.0001% per step was negligible in a 1-minute test, but fatal after 100 hours of continuous operation.
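The figures in the caption can be checked with exact arithmetic. The sketch below assumes the commonly cited details of the failure, which are not stated in this paper: the tick was 0.1 s, the stored constant kept 23 bits after the binary point, and the incoming missile travelled at roughly 1,676 m/s.

```python
from fractions import Fraction

FRACTION_BITS = 23        # assumed: bits kept after the binary point
TICK = Fraction(1, 10)    # the clock counted tenths of a second
SCUD_SPEED_M_S = 1676     # assumed approximate target speed

# Chop 0.1 to the available bits (int() truncates toward zero).
stored = Fraction(int(TICK * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = TICK - stored

relative_error_pct = float(error_per_tick / TICK) * 100
drift_1_minute = float(error_per_tick * 60 * 10)
drift_100_hours = float(error_per_tick * 100 * 3600 * 10)
miss_distance_m = drift_100_hours * SCUD_SPEED_M_S

print(f"relative error per tick: {relative_error_pct:.7f} %")
print(f"drift after 1 minute:    {drift_1_minute:.7f} s")
print(f"drift after 100 hours:   {drift_100_hours:.4f} s")
print(f"tracking error:          {miss_distance_m:.0f} m")
```

Under these assumptions the per-tick error is about 0.0000954 percent, the one-minute drift is far below a millisecond, and the 100-hour drift comes out at about 0.3433 s, or roughly 575 m at the assumed speed.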

LabWired solves this with a lockstep stepping mechanism. Instead of relying on a "real-time" clock that can drift or be interrupted by host OS noise, every instruction and hardware interrupt is synchronized to a fixed global simulation clock. There is zero drift because time only advances when the core steps.
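One way to picture a clock that only advances when the core steps is the sketch below. It is a minimal illustration of the idea, not LabWired's implementation or API: `SimClock`, `schedule`, `step`, and the interrupt names are all hypothetical. Interrupts are queued against a cycle count, and a sequence number breaks ties so that two interrupts due on the same cycle always fire in the same order.

```python
import heapq


class SimClock:
    """Hypothetical simulation clock: time is a cycle counter, not wall time."""

    def __init__(self) -> None:
        self.cycles = 0
        self._pending = []   # heap of (due_cycle, sequence, interrupt_name)
        self._sequence = 0   # tie-breaker keeps same-cycle ordering stable

    def schedule(self, delay: int, name: str) -> None:
        heapq.heappush(self._pending, (self.cycles + delay, self._sequence, name))
        self._sequence += 1

    def step(self) -> list:
        """Advance exactly one cycle and return the interrupts now due."""
        self.cycles += 1
        fired = []
        while self._pending and self._pending[0][0] <= self.cycles:
            fired.append(heapq.heappop(self._pending)[2])
        return fired


def run(program_length: int) -> list:
    clock = SimClock()
    clock.schedule(3, "TIMER")
    clock.schedule(3, "UART_RX")
    clock.schedule(7, "TIMER")
    trace = []
    for _ in range(program_length):   # one loop iteration = one core step
        for irq in clock.step():
            trace.append((clock.cycles, irq))
    return trace


print(run(10))  # [(3, 'TIMER'), (3, 'UART_RX'), (7, 'TIMER')]
```

Nothing in `run` reads the host clock, so the trace is identical on a loaded CI runner and an idle laptop.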

For high-reliability targets, the engine also supports lockstep fault injection: a golden core and a shadow core run the same firmware step for step, and the full register state is compared after every instruction. Inject a bit flip or a skipped instruction into the shadow core, and the simulation halts at the exact instruction where its state diverges from the golden run — with a deterministic snapshot of the failure state.
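The golden and shadow comparison can be sketched with a toy core. Everything below is an illustrative assumption rather than LabWired's engine: the three-instruction ISA, the four-register file, and the `lockstep` function are invented for the example. Both cores execute the same instruction, a bit flip is optionally injected into the shadow core, and the run halts at the first instruction where the register files differ.

```python
def execute(regs: list, instr: tuple) -> None:
    """Execute one instruction of a toy ISA on a register file (in place)."""
    op, rd, a, b = instr
    if op == "li":        # load immediate: rd = a
        regs[rd] = a
    elif op == "add":     # rd = regs[a] + regs[b], wrapped to 32 bits
        regs[rd] = (regs[a] + regs[b]) & 0xFFFFFFFF
    elif op == "xor":     # rd = regs[a] ^ regs[b]
        regs[rd] = regs[a] ^ regs[b]
    else:
        raise ValueError(f"unknown opcode: {op}")


PROGRAM = [
    ("li", 0, 5, 0),      # r0 = 5
    ("li", 1, 7, 0),      # r1 = 7
    ("add", 2, 0, 1),     # r2 = 12
    ("xor", 3, 2, 0),     # r3 = 12 ^ 5 = 9
    ("add", 0, 3, 2),     # r0 = 21
]


def lockstep(program, flip_at=None, flip_reg=0, flip_bit=0):
    """Return a snapshot at the first divergence, or None if the runs agree."""
    golden, shadow = [0] * 4, [0] * 4
    for pc, instr in enumerate(program):
        execute(golden, instr)
        execute(shadow, instr)
        if pc == flip_at:
            # Fault injection: flip one bit in the shadow core only.
            shadow[flip_reg] ^= 1 << flip_bit
        if golden != shadow:
            return {"pc": pc, "instr": instr,
                    "golden": list(golden), "shadow": list(shadow)}
    return None


print(lockstep(PROGRAM))  # None
print(lockstep(PROGRAM, flip_at=2, flip_reg=2, flip_bit=3))
```

With the fault injected at instruction 2, the returned snapshot reports `pc` 2 with golden registers `[5, 7, 12, 0]` and shadow registers `[5, 7, 4, 0]`, and it is the same snapshot on every run.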

The virtual replica: preventing the next ghost

In 1997, it took NASA three weeks of debugging on an exact terrestrial replica of the Pathfinder to find the priority inversion. In 2026, LabWired brings that "virtual replica" capability to every developer's local environment and CI pipeline.

Make your firmware CI deterministic →

Fighting flaky hardware tests in your pipeline? Write to contact@labwired.com or book a 30-minute call.