Linux Kernel Tuning for Resource-Constrained Embedded Systems
Embedded Systems

Worksprout Team Sep 03, 2024 10 min read

How to configure, trim, and optimise the Linux kernel for embedded targets — from Kconfig surgery to real-time scheduling and memory footprint reduction.

Start with the Right Baseline

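Never begin kernel configuration from allyesconfig or a desktop distribution's config. Start from your board's vendor defconfig, or from tinyconfig and build upward. The kernel ships board and SoC-family defconfigs under arch/arm/configs/, while arch/arm64/configs/ provides a single generic multi-platform defconfig. Running make ARCH=arm64 defconfig therefore gives you a working ARM64 baseline, but not a minimal one: it enables drivers for many SoC families and still needs trimming. Whichever starting point you choose, run make menuconfig afterwards and add only what your application actually needs.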
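
To compare how much each starting point carries, it helps to count the enabled options in the resulting .config. The sketch below is a hypothetical helper (count_enabled is our own name, not part of the kernel tree) that counts built-in and modular options:

```shell
#!/bin/sh
# count_enabled CONFIG_FILE
# Hypothetical helper: prints how many options are built in (=y) and how many
# are built as modules (=m). Lines such as "# CONFIG_FOO is not set" and
# value options such as CONFIG_HZ=1000 are not counted.
count_enabled() {
    config=$1
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the function usable under "set -e".
    n_builtin=$(grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "$config" || true)
    n_modules=$(grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$config" || true)
    echo "builtin=$n_builtin modules=$n_modules"
}

# Example, run from a kernel source tree:
#   make ARCH=arm64 tinyconfig && count_enabled .config
#   make ARCH=arm64 defconfig  && count_enabled .config
```

Running it after each make target gives a rough sense of how far a given baseline is from minimal.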

Identifying What to Remove

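A quick way to find bloat is to boot a default kernel with all of your hardware attached and your workload running, run lsmod, and disable every module that is not loaded. The kernel's make localmodconfig target automates this step using the current lsmod output. Then audit /proc/filesystems, /proc/interrupts, and /proc/devices to understand which subsystems your workload actually touches. Common removals for headless embedded targets: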

  • Sound (ALSA, ASoC) — unless audio is part of the product
  • Bluetooth stack — if unused
  • Unused filesystem types (btrfs, f2fs, exfat)
  • NFS client — unless you network-boot for development
  • PCMCIA, IEEE 1394 (FireWire), old ISA drivers
  • Staging drivers (CONFIG_STAGING)
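
The lsmod comparison described above can be sketched as a small filter. This is a minimal illustration, assuming you have saved the lsmod output to one file and a list of available module names to another; unused_modules and both file arguments are hypothetical names introduced here:

```shell
#!/bin/sh
# unused_modules LSMOD_FILE AVAILABLE_FILE
# LSMOD_FILE:     saved output of `lsmod` (header line plus one module per line).
# AVAILABLE_FILE: one module name per line, for example produced with
#   find "/lib/modules/$(uname -r)" -name '*.ko*' | sed 's|.*/||; s|\.ko.*||'
# Prints every available module that is not currently loaded.
unused_modules() {
    lsmod_file=$1
    available_file=$2
    # lsmod reports names with underscores, while .ko file names may use
    # hyphens, so names are normalised before the comparison.
    awk 'NR == FNR { if (FNR > 1) loaded[$1] = 1; next }
         { name = $1; gsub(/-/, "_", name); if (!(name in loaded)) print $1 }' \
        "$lsmod_file" "$available_file"
}
```

Treat the result as a list of candidates rather than a verdict: a module that is not loaded now may still be needed when a peripheral is hot-plugged or a rarely used feature is exercised.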

Real-Time Scheduling with PREEMPT_RT

For applications with hard timing requirements (motor control, sensor fusion, protocol handling with strict deadlines), the mainline kernel's default preemption model introduces latency that is unacceptable. PREEMPT_RT, which was merged upstream piece by piece over many years and can be enabled in mainline for x86, arm64, and RISC-V since Linux 6.12, converts nearly all kernel spinlocks into sleeping locks built on rt_mutex and runs interrupt handlers in preemptible kernel threads. On suitable hardware this reduces worst-case latency from hundreds of microseconds to tens. Older kernels still need the out-of-tree patch set.

# Enable in Kconfig
CONFIG_PREEMPT_RT=y
CONFIG_HZ_1000=y
# Tickless operation applies only to CPUs listed in the nohz_full= boot parameter
CONFIG_NO_HZ_FULL=y
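
Kconfig silently drops options whose dependencies are not met, so it is worth confirming that the final .config still contains these settings after running make olddefconfig. A minimal sketch, where missing_options is a hypothetical helper name:

```shell
#!/bin/sh
# missing_options CONFIG_FILE SYMBOL...
# Hypothetical helper: prints each requested symbol (given without the
# CONFIG_ prefix) that is not set to =y in CONFIG_FILE.
missing_options() {
    config=$1
    shift
    for sym in "$@"; do
        grep -q "^CONFIG_${sym}=y\$" "$config" || echo "CONFIG_${sym}"
    done
}

# Example, run from a kernel source tree:
#   missing_options .config PREEMPT_RT HZ_1000 NO_HZ_FULL
```

An empty result means all requested options survived dependency resolution.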

Measure before and after with cyclictest while the system runs a representative load, since short idle runs understate worst-case latency:

cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 -l 100000

On a Raspberry Pi 4 with PREEMPT_RT, worst-case latency typically drops from ~1.5 ms to under 150 µs under load.
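
To compare runs, the figure that matters is the largest Max value across all measurement threads. The sketch below extracts it, assuming the per-thread summary lines that cyclictest prints (lines starting with "T:" and containing "Max: N", in microseconds); worst_case_latency is a hypothetical helper name:

```shell
#!/bin/sh
# worst_case_latency
# Hypothetical helper: reads cyclictest output on stdin and prints the
# largest "Max:" value found on any per-thread line (prints 0 if none).
worst_case_latency() {
    awk '/^T:/ && match($0, /Max:[ ]*[0-9]+/) {
             # Skip the 4 characters of "Max:" and convert the rest to a number.
             v = substr($0, RSTART + 4, RLENGTH - 4) + 0
             if (v > max) max = v
         }
         END { print max + 0 }'
}

# Example:
#   cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 \
#       -l 100000 -q | worst_case_latency
```

Recording this single number for the stock and PREEMPT_RT kernels under the same load gives a like-for-like comparison.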

Memory Footprint Reduction

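RAM is expensive on constrained targets. Enable CONFIG_CC_OPTIMIZE_FOR_SIZE to compile the kernel with -Os instead of -O2. The allocator choice depends on your kernel version: SLOB, historically the smallest option but lacking debugging support, was removed in Linux 6.4, and the original SLAB allocator was removed in 6.8, which leaves SLUB as the only allocator. On current kernels, enable CONFIG_SLUB_TINY to configure SLUB for a minimal memory footprint at some cost in scalability and debugging features. On older kernels that still offer a choice, SLUB is the recommended default. For very tight targets consider CONFIG_BASE_SMALL=y, which shrinks several core data structures (on older kernels, where this option is derived from CONFIG_BASE_FULL, disable CONFIG_BASE_FULL instead).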

Check your actual kernel size with:

size vmlinux
ls -lh arch/arm64/boot/Image
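
The image size covers only the static part of the footprint. At runtime the kernel also allocates memory dynamically, and /proc/meminfo gives a rough view of it. The sketch below sums three fields as an approximation; the choice of Slab, KernelStack, and PageTables is our assumption, it excludes kernel text and static data, and kernel_runtime_kib is a hypothetical helper name:

```shell
#!/bin/sh
# kernel_runtime_kib [MEMINFO_FILE]
# Hypothetical helper: sums the Slab, KernelStack and PageTables fields
# (reported in kB) from a meminfo-style file, defaulting to /proc/meminfo.
kernel_runtime_kib() {
    awk '$1 == "Slab:" || $1 == "KernelStack:" || $1 == "PageTables:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Example, on the target at idle:
#   kernel_runtime_kib
```

Tracking this number across configuration changes shows whether a removal actually reduced runtime memory use, not just image size.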

Reducing Boot Time via Kernel Parameters

Kernel command-line parameters that meaningfully reduce boot time on embedded targets:

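  • quiet loglevel=0: suppress console output (can save 50-100 ms on slow UARTs)
  • rootfstype=ext4: avoids filesystem probing when mounting the root filesystem
  • fsck.mode=skip: read from the command line by systemd-fsck rather than by the kernel itself; skips fsck, which is appropriate for a read-only rootfs
  • initcall_debug: use during profiling to identify slow init calls, then remove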

Combine with systemd-analyze blame (or the equivalent for your init system) to see where time goes after the kernel hands off to userspace.
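
With initcall_debug enabled, the kernel log contains one "initcall ... returned ... after N usecs" line per init call. The sketch below ranks them by duration, assuming that log format; slow_initcalls is a hypothetical helper name:

```shell
#!/bin/sh
# slow_initcalls [LIMIT]
# Hypothetical helper: reads kernel log text on stdin and prints the LIMIT
# slowest init calls (default 10) as "<usecs> <function>", slowest first.
slow_initcalls() {
    limit=${1:-10}
    awk '/initcall .* returned .* after [0-9]+ usecs/ {
             for (i = 1; i <= NF; i++) {
                 if ($i == "initcall") name = $(i + 1)
                 if ($i == "after")    usecs = $(i + 1)
             }
             sub(/\+.*/, "", name)   # drop the "+0x0/0x40" offset suffix
             print usecs, name
         }' | sort -rn | head -n "$limit"
}

# Example, on a target booted with initcall_debug:
#   dmesg | slow_initcalls 5
```

The entries at the top of the list are the first candidates to disable, build as modules, or defer.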

Devicetree Overlays for Hardware Flexibility

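For products with optional hardware peripherals, use devicetree overlays rather than building separate kernel images per SKU. Overlays are small compiled fragments (.dtbo files) that the bootloader merges into the base device tree before the kernel starts. The base DTB must be built with symbols (dtc -@) so that overlays can resolve their label references. In U-Boot, the fdt apply command performs the merge; fdtoverlay is the equivalent host-side tool from the dtc package, useful for checking overlays at build time: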

load mmc 0:1 ${fdt_addr_r} base.dtb
load mmc 0:1 0x02000000 overlay-uart2.dtbo
fdt addr ${fdt_addr_r}
fdt resize 65536
fdt apply 0x02000000
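
When each SKU needs a different set of overlays, the command sequence above can be generated at build time instead of maintained by hand. A minimal sketch, reusing the device, partition, and load address from the example above; gen_overlay_script and the file names passed to it are hypothetical:

```shell
#!/bin/sh
# gen_overlay_script BASE_DTB [OVERLAY_DTBO...]
# Hypothetical helper: prints U-Boot commands that load the base DTB and
# apply each overlay in order. The ${fdt_addr_r} references are emitted
# literally so that U-Boot, not this shell, expands them.
gen_overlay_script() {
    base=$1
    shift
    printf 'load mmc 0:1 ${fdt_addr_r} %s\n' "$base"
    printf 'fdt addr ${fdt_addr_r}\n'
    printf 'fdt resize 65536\n'
    for overlay in "$@"; do
        printf 'load mmc 0:1 0x02000000 %s\n' "$overlay"
        printf 'fdt apply 0x02000000\n'
    done
}

# Example:
#   gen_overlay_script base.dtb overlay-uart2.dtbo > boot.cmd
```

The generated text can then be wrapped into a boot script image with U-Boot's mkimage tool (script image type), so that one kernel image serves every SKU.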

Conclusion

Kernel tuning is an iterative discipline. Profile first, then remove or optimise specific subsystems. A well-tuned embedded kernel should boot in under two seconds from power-on to first userspace process, consume less than 12 MB of RAM at idle, and — if real-time is required — deliver deterministic scheduling latency. These goals are achievable on commodity hardware with the right configuration discipline.

Worksprout Team

The Worksprout engineering team specialises in embedded Linux, RDK-B broadband platforms, edge AI, and robotics systems. Based in Rajshahi, Bangladesh, we design and deploy production embedded intelligence for clients across South Asia and beyond.
