This guide covers the automated test infrastructure. For build and run instructions see Building & Running.
./run.sh iso test)The single command for complete ISO CI validation:
./run.sh iso test
Runs two phases; build steps use Docker, QEMU/GDB prefer the host if available:
Phase 1 - in-kernel ktest suite
Boots makar-test.iso (single grub menuentry with timeout=0, multiboot2 /boot/makar.kernel test_mode). The kernel parses the multiboot2 cmdline, runs ktest_run_all() (all subsystem unit tests including a live ring-3 userspace execution), then exits QEMU cleanly via isa-debug-exit. Output: ktest.log.
makar-test.iso and makar.iso are emitted from the same kernel binary; there is no test-only build flag.
Phase 2 - GDB boot-checkpoint tests
Builds a normal debug ISO. Creates a 32 MiB FAT32 test disk and attaches it
on IDE:0 alongside the CD-ROM so the kernel can auto-mount it at /mnt/root. Launches QEMU
with the GDB stub and runs tests/gdb_boot_test.py. Output: gdb-test.log.
Exit code 0 = everything passed; 1 = any failure or timeout.
./run.sh hdd test)Verifies the installed HDD boot path end-to-end - no CD-ROM attached:
./run.sh hdd test
What it does:
src/kernel/makar.kernel (GDB symbol file) matches the binary written into the image.makar-hdd-test.img - fresh raw MBR + FAT32 + GRUB 2 image using the interactive kernel so shell_run is called and vfs_auto_mount() runs. Kept separate from makar-hdd.img so interactive and test images never share state.-boot c (HDD-only) and runs tests/gdb_hdd_test.py.Output files: hdd-test-gdb.log, hdd-test-serial.log.
Both gdb_boot_test.py (ISO boot) and gdb_hdd_test.py (HDD boot) run the
same four groups, providing equivalent external verification regardless of
boot medium:
| Group | What it verifies |
|---|---|
boot_checkpoints |
Every major boot function reached in order: kernel_main → terminal_initialize → … → shell_run |
hardware_state |
CR0.PG set (paging enabled), CR3 non-zero (page directory loaded), timer_callback fires (PIT ticking) |
vesa |
VESA framebuffer active and TTY initialised (or absent without crashing - graceful headless) |
hdd_mount |
fat32_mounted() non-zero - HDD rootfs auto-mounted (single-partition → /mnt/root; dual-partition → /mnt/boot + /mnt/root) after shell_run |
root_home |
(HDD only) vfs_file_exists("/root") returns 1, confirming vfs_ensure_root_home() mkdir’d it on the writable rootfs |
The hdd_mount check advances execution to keyboard_getchar (the shell’s
read-loop entry) before inspecting fat32_mounted(), ensuring
vfs_auto_mount() has fully completed.
The ISO GDB test creates the FAT32 test disk using mkfs.fat --offset
(sector-based, no losetup / --privileged needed), which works inside the
GitHub Actions container job.
To add a new group: create tests/groups/<name>.py exposing NAME and
run() → bool, then import it into both gdb_boot_test.py and
gdb_hdd_test.py.
# Build a debug ISO first
./run.sh iso boot # or: CFLAGS='-O0 -g3' ./run.sh iso release
# In one terminal - start QEMU with GDB stub (inside Docker)
docker run --rm -it -v "$PWD:/work" -w /work arawn780/gcc-cross-i686-elf:fast \
bash -lc 'qemu-system-i386 -cdrom makar.iso -s -S -display none -serial stdio'
# In another terminal - attach GDB (inside Docker or native)
docker run --rm -it -v "$PWD:/work" -w /work arawn780/gcc-cross-i686-elf:fast \
gdb-multiarch src/kernel/makar.kernel \
-ex "target remote :1234" \
-ex "break kernel_main" \
-ex "continue"
The -O0 -g3 flags ensure DWARF debug info is accurate. QEMU starts with
-S (freeze at reset), giving you time to set breakpoints before execution
begins.
From the kernel shell:
ktest
Runs ktest_run_all() and prints pass/fail for each subsystem suite
(PMM, heap, ring-3 execution, etc.) directly to the terminal and serial log.
./run.sh ui)Drives the running kernel through QEMU’s HMP sendkey and asserts on
the COM1 serial slice + PPM screen dump. Scenarios live in
tests/ui_test.sh as test_<name> shell functions; the shared runner
in tests/ui_runner.sh boots one QEMU per invocation and rotates
through scenarios with reset_shell in between.
The shell emits [shell:ready vt=N] to serial on every
shell_readline entry (gated by g_serial_verbose, which
start_qemu flips on as the first action after boot). Test
scenarios use this marker as a sync point instead of fixed sleep N:
| Primitive | What it does |
|---|---|
it <name> <script> [wait_secs] |
Fixed-sleep style. Send keys, sleep, snapshot serial slice. |
it_until <name> <script> <regex> [timeout] |
Sync-on-marker style. Send keys, poll the serial log until the regex appears, then snapshot. Default sync regex is '\[shell:ready vt=0\]'. |
wait_for_serial <regex> <start_bytes> [timeout] |
Low-level primitive. Polls $SERIAL_LOG from <start_bytes> until <regex> matches. Used by it_until and directly by multi-stage scenarios that need to sync between key batches (the typo-doesnt-clear scenario is the canonical multi-stage example: type the typo, wait for shell-ready, then type the next command, otherwise the in-flight bytes race the dying exec’d child and get dropped when keyboard_release_task reaps its slot). |
This is the expect/pexpect pattern, scaled down. Why it matters: a
fixed sleep 1.2 is a guess that’s right under KVM and wrong under
TCG-on-a-shared-runner. A marker sync is right in both – it returns
the instant the kernel says it’s ready and bounds via the timeout.
| Scenario | Asserts |
|---|---|
glob-proc |
cat /proc/* glob-expands across the synthetic FS |
tab-complete-path |
cat<TAB> /proc/c<TAB><Enter> resolves to cat /proc/cpuinfo |
exec-hello |
exec /apps/hello.elf tester reaches sys_exit(0) and prints the expected greeting |
cd-root-listing |
cd /<TAB><TAB> lists mounts; subsequent pwd confirms cwd |
per-tty-cwd |
Per-task cwd isolation across Alt+F1/Alt+F3 switches |
calc-brackets |
calc.elf evaluates parenthesised arithmetic |
ctrlc-kills-child |
Ctrl+C delivers SIGINT to a running ELF, shell recovers |
no-dead-in-proctasks |
cat /proc/tasks never lists DEAD slots |
typo-doesnt-clear |
A wrong command falls back to makbox, prints an error, and does not wipe the screen (proves the task_t.fb_touched gate) |
user-sigusr1-handler |
sigtest.elf installs a SIGUSR1 handler, self-sends, the handler runs in ring 3 (proves the sigframe + trampoline + SYS_SIGRETURN path) |
makbox-pwd |
pwd resolves to the makbox applet end-to-end |
./run.sh ui runs all of them; ./run.sh ui <name> runs
one; ./run.sh ui graphical runs with a visible QEMU window for
debugging.
incore.sh)For scenarios that just need to “run a binary, check it exited 0,”
the per-test HMP round-trip is overhead. src/userspace/incore.sh
is a kernel-sh script that drives those tests directly inside the
guest via exec + $?: each test invokes its ELF and the script
branches on the child’s SYS_EXIT value (low 8 bits) surfaced as
$? by shell_last_exec_status() in shell_cmd_apps.c. Final
marker INCORE: ALL PASS (or INCORE: FAIL) is the one substring
the runner asserts on. Fronted by the single HMP scenario
test_incore (./run.sh ui incore); see src/userspace/incore.sh
for the current test list (hello, forktest, execvetest, alloctest).
Trade-off vs HMP scenarios: faster (no per-test typing/settle, no
reset_shell, no reaper-output races), and the test list is editable
in a .sh file without touching the runner – but no screendump
evidence on panic, so this is only the right shape for tests that
don’t depend on framebuffer state. Interactive features (TAB,
Ctrl-C, VT switching, sh.elf readline, fullscreen apps) stay in
HMP-driven scenarios where the keyboard event is itself under test.
Each ELF in incore.sh is expected to print [name] PASS /
[name] FAIL: <reason> on its own and exit 0 / non-zero; the
script just aggregates. See alloctest.c for the canonical shape
(12 sub-tests, each emitting a status line, return 0/return 1).
.github/workflows/build-test.yml runs a build-once, fan-out topology on every push to main and every PR. Docs-only changes are skipped via path filter.
| Job | Runner | What runs |
|---|---|---|
build |
ubuntu-latest (host, Docker available) |
./run.sh iso build + ./run.sh hdd build → uploads makar.kernel, makar.iso, makar-test.iso, makar-hdd-test.img as artifact makar-build |
ktest |
ubuntu-latest + container arawn780/gcc-cross-i686-elf:fast |
downloads artifact → ./run.sh ktest |
gdb-iso |
ubuntu-latest + container |
downloads artifact → ./run.sh gdb iso |
gdb-hdd |
ubuntu-latest + container |
downloads artifact → ./run.sh gdb hdd |
Why this shape:
build job runs on the host (not in a container) because generate-hdd.sh spawns its own privileged Docker container for loop-device work, which is awkward to nest.qemu-system-i386 and gdb-multiarch without further setup.run.sh _qemu_accel); software breakpoints under the GDB stub never catch on KVM, and ktest hit a path-fault masked by KVM’s CPU timing.actions/cache@v4 with the cascade ccache-$-$ → ccache-$-$- → ccache-$-main- → ccache-$-. Warm rebuilds hit ~47 % cache.The release.yml workflow gates artifact publication on build-test.yml succeeding via workflow_call.