TECHNICAL BLOG
Deep Dives for Engineers
Detailed technical articles covering the real problems we solve in embedded systems, AI, and robotics engineering.
How to set up, configure, and run ROS 2 on resource-constrained embedded Linux targets — from cross-compilation and DDS tuning to deploying nodes on Raspberry Pi and Jetson hardware.
ROS 2 (Robot Operating System 2) is the successor to ROS 1, rebuilt from the ground up with embedded and safety-critical deployment in mind. Its key improvements — a DDS-based communications layer, real-time executor support, no dependency on a central ROS master process, and first-class support for ARM architectures — make it viable for deployment directly on embedded Linux targets rather than being confined to developer workstations.
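ROS 2 distributions are tied to Ubuntu LTS releases. For embedded deployments, prefer a long-term-support distribution: Humble Hawksbill (supported through May 2027) is the current recommendation. Its Tier 1 platforms include Ubuntu 22.04 (Jammy) on ARM64, so the binary packages install directly on Ubuntu 22.04 for Raspberry Pi and on Jetson boards whose JetPack release is based on Ubuntu 22.04. Raspberry Pi OS is Debian-based rather than Ubuntu-based, so on that OS expect to run ROS 2 in a container or build it from source.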
# On Ubuntu 22.04 (Jammy) arm64; the "jammy" apt repository below does not match Raspberry Pi OS Bookworm
sudo apt install -y software-properties-common curl
sudo add-apt-repository universe
sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.key -o /usr/share/keyrings/ros-archive-keyring.gpg
echo "deb [arch=arm64 signed-by=/usr/share/keyrings/ros-archive-keyring.gpg] http://packages.ros.org/ros2/ubuntu jammy main" | sudo tee /etc/apt/sources.list.d/ros2.list
sudo apt update
sudo apt install -y ros-humble-ros-base python3-colcon-common-extensions
# Load the ROS 2 environment into the current shell
source /opt/ros/humble/setup.bash
The ros-base metapackage installs the core DDS transport, rclpy/rclcpp, and command-line tools without desktop GUI tools — appropriate for headless embedded targets.
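A quick sanity check after installation is to confirm that the shell has picked up the ROS 2 environment. The sketch below assumes that sourcing /opt/ros/humble/setup.bash exports ROS_DISTRO, ROS_VERSION, and AMENT_PREFIX_PATH; the function names are hypothetical helpers, not part of any ROS 2 API.

```python
import os

# Variables that /opt/ros/<distro>/setup.bash is expected to export (assumption).
REQUIRED_ROS_VARS = ("ROS_DISTRO", "ROS_VERSION", "AMENT_PREFIX_PATH")


def missing_ros_vars(env):
    """Return the required ROS 2 variables that are absent or empty in env."""
    return [name for name in REQUIRED_ROS_VARS if not env.get(name)]


def check_ros_environment(env, expected_distro="humble"):
    """Return a list of problems; an empty list means the environment looks sane."""
    problems = [f"{name} is not set" for name in missing_ros_vars(env)]
    distro = env.get("ROS_DISTRO")
    if distro and distro != expected_distro:
        problems.append(f"ROS_DISTRO is {distro!r}, expected {expected_distro!r}")
    return problems


if __name__ == "__main__":
    for problem in check_ros_environment(os.environ):
        print("warning:", problem)
```

Running it in a shell that has not sourced the setup script should list all three variables as missing.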
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Temperature


class TemperatureSensor(Node):
    def __init__(self):
        super().__init__("temperature_sensor")
        # Queue depth of 10 on the default QoS profile
        self.publisher = self.create_publisher(Temperature, "sensor/temperature", 10)
        # Publish one reading per second
        self.timer = self.create_timer(1.0, self.publish_reading)
        self.get_logger().info("Temperature sensor node started")

    def read_sensor(self):
        # Placeholder: replace with your hardware read call (degrees Celsius)
        return 0.0

    def publish_reading(self):
        msg = Temperature()
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.temperature = self.read_sensor()
        msg.variance = 0.1
        self.publisher.publish(msg)


def main():
    rclpy.init()
    node = TemperatureSensor()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()
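The hardware read call in the node above is left to the reader. As one possible sketch, assuming the value of interest is the SoC temperature that the Linux kernel exposes through sysfs in millidegrees Celsius, the read could look like this. The thermal_zone0 path is an assumption about your board, and both function names are hypothetical.

```python
from pathlib import Path

# Assumed sensor source: the first kernel thermal zone (often the SoC die sensor).
DEFAULT_THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw):
    """Convert the sysfs text value (millidegrees Celsius) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temperature(path=DEFAULT_THERMAL_ZONE):
    """Read one temperature sample in degrees Celsius from a thermal zone file."""
    return parse_millidegrees(Path(path).read_text())


if __name__ == "__main__":
    print(f"{read_cpu_temperature():.1f} C")
```

Keeping the parsing separate from the file read makes the conversion easy to test on a development machine that has no such sensor.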
The default DDS middleware in ROS 2 Humble (eProsima Fast DDS) ships with settings suited to a network of several machines, such as multicast discovery. On a single-board embedded target, tune it for minimal overhead by loading an XML profile:
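# Point this at a Fast DDS XML profile you maintain; verify that the path below exists on your install
export FASTRTPS_DEFAULT_PROFILES_FILE=/opt/ros/humble/share/fastrtps/examples/cpp/dds/HelloWorldExample/MIN_FOOTPRINT_PROFILE.xml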
export RMW_FASTRTPS_USE_QOS_FROM_XML=1
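Consider switching to Cyclone DDS (package ros-humble-rmw-cyclonedds-cpp, selected at runtime with RMW_IMPLEMENTATION=rmw_cyclonedds_cpp) for a lower memory footprint on embedded targets. In single-node configurations it can use roughly 30-40% less memory than Fast DDS, though the exact saving depends on the workload, so measure it on your own target.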
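Because the middleware is chosen through environment variables, it can be convenient to build the environment in one place before launching nodes. The sketch below is one way to do that, under the assumption that every node runs on the same board, so discovery can be limited to localhost with ROS_LOCALHOST_ONLY (a Humble setting). The helper name build_node_env is hypothetical.

```python
import os


def build_node_env(base_env, rmw="rmw_cyclonedds_cpp", localhost_only=True):
    """Return a copy of base_env configured for the chosen RMW implementation."""
    env = dict(base_env)
    # RMW_IMPLEMENTATION selects the middleware when the process starts.
    env["RMW_IMPLEMENTATION"] = rmw
    # Assumption: all nodes are local, so traffic can stay on the loopback interface.
    env["ROS_LOCALHOST_ONLY"] = "1" if localhost_only else "0"
    if rmw != "rmw_fastrtps_cpp":
        # Fast DDS-specific settings do not apply to other middlewares; drop them.
        env.pop("FASTRTPS_DEFAULT_PROFILES_FILE", None)
        env.pop("RMW_FASTRTPS_USE_QOS_FROM_XML", None)
    return env


if __name__ == "__main__":
    env = build_node_env(os.environ)
    print(env["RMW_IMPLEMENTATION"], env["ROS_LOCALHOST_ONLY"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting a node with ros2 run, which keeps the choice of middleware out of shell profiles.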
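For nodes with timing requirements, use the StaticSingleThreadedExecutor, which scans the node's subscriptions, timers, and services once when the node is added instead of on every iteration, and run it on a thread with a real-time scheduling policy such as SCHED_FIFO: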
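#include <memory>
#include <pthread.h>
#include <sched.h>
#include "rclcpp/rclcpp.hpp"

// Placeholder node; substitute your own motor control implementation
class MotorControlNode : public rclcpp::Node {
public:
  MotorControlNode() : Node("motor_control") {}
};

int main(int argc, char * argv[]) {
  rclcpp::init(argc, argv);
  auto node = std::make_shared<MotorControlNode>();
  rclcpp::executors::StaticSingleThreadedExecutor exec;
  exec.add_node(node);

  // Elevate thread priority (requires root or CAP_SYS_NICE)
  sched_param param{};
  param.sched_priority = 80;
  if (pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) != 0) {
    RCLCPP_WARN(node->get_logger(), "Could not set SCHED_FIFO; using default scheduling");
  }

  exec.spin();
  rclcpp::shutdown();
  return 0;
}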
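For nodes written with rclpy, a comparable priority change can be sketched with the Python standard library, which wraps the same Linux scheduling calls. This is a minimal sketch rather than a full real-time setup: it assumes a Linux target, needs root or CAP_SYS_NICE to succeed, and does not address Python's own sources of latency such as garbage collection. The function names are hypothetical.

```python
import os


def clamp_priority(priority, lowest, highest):
    """Limit priority to the inclusive range [lowest, highest]."""
    return max(lowest, min(highest, priority))


def try_enable_realtime(priority=80):
    """Try to switch the calling process to SCHED_FIFO; return True on success."""
    if not hasattr(os, "sched_setscheduler"):
        return False  # Scheduling API not available on this platform
    # Ask the kernel which priorities are valid for SCHED_FIFO.
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    param = os.sched_param(clamp_priority(priority, lowest, highest))
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, param)  # pid 0: the caller
    except OSError:
        return False  # Typically insufficient privileges
    return True


if __name__ == "__main__":
    if not try_enable_realtime():
        print("warning: running with default scheduling")
```

Calling try_enable_realtime() before rclpy.spin() lets the node fall back to default scheduling with a warning instead of failing at startup.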
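Build ROS 2 packages on a development machine and deploy the binaries to the target to avoid slow builds on single-board computers. Cross-compile with a CMake toolchain file and a sysroot that matches your target; QEMU user-mode emulation helps when populating that sysroot or running target binaries on the build host: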
colcon build --cmake-args -DCMAKE_TOOLCHAIN_FILE=arm64_toolchain.cmake -DCMAKE_SYSROOT=/opt/sysroots/aarch64-rpi4
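The contents of the toolchain file are not shown above. As a rough sketch of what such a file usually contains, the script below renders one from a sysroot path. It assumes a GNU cross toolchain with the aarch64-linux-gnu prefix is installed on the build machine; the function name is hypothetical, and a real project may need more settings than these.

```python
def render_toolchain(sysroot, triple="aarch64-linux-gnu"):
    """Return the text of a minimal CMake toolchain file for an ARM64 Linux target."""
    lines = [
        "set(CMAKE_SYSTEM_NAME Linux)",
        "set(CMAKE_SYSTEM_PROCESSOR aarch64)",
        # Assumed compiler names; adjust to the cross toolchain you have installed.
        f"set(CMAKE_C_COMPILER {triple}-gcc)",
        f"set(CMAKE_CXX_COMPILER {triple}-g++)",
        f'set(CMAKE_SYSROOT "{sysroot}")',
        f'set(CMAKE_FIND_ROOT_PATH "{sysroot}")',
        # Run build tools from the host, but take libraries, headers, and
        # packages only from the target sysroot.
        "set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)",
        "set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_toolchain("/opt/sysroots/aarch64-rpi4"), end="")
```

Redirecting the output to arm64_toolchain.cmake produces a file that the colcon command above can consume through -DCMAKE_TOOLCHAIN_FILE.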
ROS 2 on embedded Linux is production-viable when you select the right DDS implementation, tune the middleware for your hardware, and use the appropriate executor for your timing requirements. A proper cross-compilation setup pays for itself quickly: compile on a fast development machine, deploy to the target, and iterate rapidly.