ROS2 with multiple machines doesn't work properly with point clouds
Hi,
I'm trying to get ROS2 to work in a multiple machine configuration.
Configuration:
2 machines:
- Embedded PC
- Development PC
Ubuntu 22.04 + ROS2 humble on both machines. Both machines are on the same wifi network.
The embedded PC is mounted on a vehicle and I want to use my dev PC to visualize topics, point clouds, etc.
I could successfully test the communication with simple demo nodes (demo_nodes_cpp talker and listener) running on both PCs.
The problem starts when I launch the ROS driver of a 3D LiDAR (Ouster OS1) on the embedded PC. As long as I don't subscribe to the point cloud (e.g. with rviz2) on the development PC, everything works fine: the point cloud shows up in rviz2 running on the embedded machine. But if I subscribe to the cloud from the dev PC, the point clouds are incomplete.
After taking a look at the Ouster ROS driver (https://github.com/ouster-lidar/ouster-ros/tree/ros2), I could narrow down the source of the problem. The driver is composed of 2 nodes:
- os_sensor: receives UDP packets from the LiDAR and publishes them to a topic /lidar_packets
- os_cloud: subscribes to the packets (/lidar_packets) and assembles them to build the complete point clouds (64 packets = 1 point cloud). Then it publishes the clouds to the topic /points.
Displaying the frequency of /lidar_packets with ros2 topic hz, I noticed the frequency was much lower when I subscribed to the point clouds (/points) on the dev machine. My assumption is that when I subscribe to /points on the dev machine, the large amount of data transmitted overloads the network. This would prevent some /lidar_packets messages from reaching os_cloud and thus cause the point clouds to be incomplete.
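To put a rough number on "large amount of data", here is a back-of-the-envelope estimate in plain Python. The function name and all input values are hypothetical placeholders, not measurements from my setup; the real figures can be read from the width, height and point_step fields of the PointCloud2 message, or measured directly with ros2 topic bw. The estimate only covers the raw payload and ignores DDS/UDP overhead.

```python
def cloud_bandwidth_mbit_s(points_per_cloud: int,
                           bytes_per_point: int,
                           clouds_per_second: float) -> float:
    """Return the raw payload rate of a point cloud topic in Mbit/s.

    DDS, UDP and Wi-Fi overhead are NOT included, so the real load is higher.
    """
    bytes_per_second = points_per_cloud * bytes_per_point * clouds_per_second
    return bytes_per_second * 8 / 1e6


# Hypothetical values (assumptions, not taken from the actual sensor config):
# 64 x 1024 points per cloud, 48 bytes per point, 10 clouds per second.
estimate = cloud_bandwidth_mbit_s(
    points_per_cloud=64 * 1024,
    bytes_per_point=48,
    clouds_per_second=10.0,
)
print(f"{estimate:.1f} Mbit/s")  # prints "251.7 Mbit/s"
```

If the real numbers are in this range, a single remote subscriber to /points could plausibly saturate a typical Wi-Fi link.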
I searched online how the ROS2 DDS layer works but I could not understand how it could explain the issue I face. I attempted to run a discovery server listening on all interfaces on the embedded PC (see https://docs.ros.org/en/foxy/Tutorials/Advanced/Discovery-Server/Discovery-Server.html) but it gave me the same result.
I suspect the wifi router's performance is part of the problem, but even if that is the case, I think this behavior should not happen: the functioning of the embedded PC should not be affected by another PC used for visualization. If not all messages can be transmitted to the dev PC in time due to the bandwidth of the wifi connection, I'd prefer some /scan messages to be skipped by the dev PC rather than /lidar_packets messages, which belong to the internal functioning of the embedded system.
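The behaviour I'm after can be illustrated with a toy model of a "keep last" buffer: the writer never waits for a slow reader, and stale samples are simply overwritten. This is only a plain-Python illustration of the idea (KeepLastBuffer is a hypothetical name), not how DDS is actually implemented; in ROS 2 terms it roughly corresponds to a best-effort subscription with KEEP_LAST history and a depth of 1.

```python
from collections import deque


class KeepLastBuffer:
    """Toy model of a KEEP_LAST history: writes never block, old samples are evicted."""

    def __init__(self, depth: int) -> None:
        self._samples = deque(maxlen=depth)
        self.dropped = 0

    def write(self, sample) -> None:
        # If the buffer is full, appending evicts the oldest sample.
        if len(self._samples) == self._samples.maxlen:
            self.dropped += 1
        self._samples.append(sample)

    def take(self):
        # Return the oldest buffered sample, or None if nothing is waiting.
        return self._samples.popleft() if self._samples else None


# The publisher writes 10 clouds; the slow reader only manages to take
# one sample after every 5th write.
buffer = KeepLastBuffer(depth=1)
received = []
for cloud_id in range(10):
    buffer.write(cloud_id)
    if cloud_id % 5 == 4:
        received.append(buffer.take())

print(received)        # [4, 9]
print(buffer.dropped)  # 8
```

The slow reader sees only the most recent clouds and the writer is never held back, which is what I would expect for a visualization-only subscriber.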
How can I better diagnose the source of the problem? Is there a way to achieve the behaviour I want?
Thanks!
Asked by arusso on 2023-06-20 09:52:07 UTC
Comments
Related/similar discussion: ros2/ros2#1434.
Asked by gvdhoorn on 2023-06-20 11:22:55 UTC
Thank you for this lead! I'm not sure it's the same problem, because the one they face seems related to the use of subscribers with different QoS settings for the same topic. In my case all subscribers are Best Effort and there is only 1 subscriber per topic. Anyway, I'll try the workaround with DDS Routers they mention.
Asked by arusso on 2023-06-23 09:46:14 UTC