Multisensor Dataset for AI-Based Environment Perception Published
2025/10/22

A comprehensive multisensor dataset provides the foundation for AI-based environment perception in fully automated rail operations. The dataset was developed jointly by DB InfraGO and understandAI GmbH (UAI) within the Digital Rail Germany sector initiative and contains over seven million precise annotations on recordings of real train journeys.

Environment Perception as a Key Technology for Automation

Reliable perception of the track environment is a central prerequisite for automated operation in the railway sector. Even at the automation level ATO GoA2 – with a driver still on board – assistance systems can significantly improve track monitoring. At ATO GoA4, the fully automated and driverless mode, the system must be capable of independently detecting its surroundings in real time and responding to potential hazards.

This is where AI-based methods come into play. To function reliably, such systems require large volumes of high-quality training data – realistic, diverse and precisely annotated. Annotations are markings within raw data used for training, testing and validating AI algorithms. The new multisensor dataset provides exactly this. It was created within the Digital Rail Germany (DSD) sector initiative by DB InfraGO AG in collaboration with understandAI GmbH (UAI).

To ensure the greatest possible data diversity, two different railway vehicles were equipped with extensive sensor technology:

Track Maintenance Vehicle (GAF)

As part of an internal DB project, a track maintenance vehicle was equipped with six RGB cameras, three infrared cameras, six LiDARs, one radar unit and additional sensors. Test runs were carried out in diverse infrastructural environments in Hamburg and Berlin.

Hamburg S-Bahn (BR 472)

In the Sensors4Rail project – with partners including Bosch Engineering, Siemens Mobility, HERE Technologies and MicroVision – an S-Bahn train was fitted with camera, LiDAR and radar sensors and recorded data along the 23-kilometre route of line S21.

7 Million Annotations Across 21 Object Classes

Since 2021, more than seven million annotations have been created across 21 object classes – including people, vehicles, signals, animals, bicycles, tracks and other relevant objects. The combination of automated annotation tools, manual post-processing and quality control ensures the highest data quality.

Annotation was carried out on several data levels – from 3D LiDAR data to RGB and infrared images and radar data. Specific projection methods were used, such as converting 3D boxes into 2D bounding boxes or polygons. Additional metadata (e.g. object attributes) was added, and all annotations underwent a multi-stage verification process. UAI achieved a peak annotation throughput of 140,000 annotations per week.
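
To illustrate what converting a 3D box into a 2D bounding box can involve, the following is a minimal sketch and not the method used by DB InfraGO or UAI. It assumes an ideal pinhole camera without lens distortion and a box already expressed in the camera frame (x right, y down, z forward). The function names, parameters and the example intrinsics are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in the (assumed) camera frame.

    Assumed convention: x right, y down, z forward.
    size is the box extent along x, y and z before rotation;
    yaw rotates the box about the vertical (y) axis.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx, dy, dz in product((-0.5, 0.5), repeat=3):
        x, y, z = dx * sx, dy * sy, dz * sz
        # Rotate the corner offset about the y axis, then shift to the center.
        xr = c * x + s * z
        zr = -s * x + c * z
        corners.append((cx + xr, cy + y, cz + zr))
    return corners


def project_box_to_2d(center, size, yaw, fx, fy, px, py, width, height):
    """Project a 3D box to an axis-aligned 2D box (u_min, v_min, u_max, v_max).

    fx, fy are focal lengths in pixels and (px, py) is the principal point.
    Returns None if any corner lies behind the camera (a simplification)
    or if the clipped box falls outside the image.
    """
    corners = box_corners(center, size, yaw)
    if any(z <= 0 for _, _, z in corners):
        return None
    # Pinhole projection of every corner onto the image plane.
    us = [fx * x / z + px for x, _, z in corners]
    vs = [fy * y / z + py for _, y, z in corners]
    # The 2D box is the extent of the projected corners, clipped to the image.
    u_min, u_max = max(0.0, min(us)), min(float(width), max(us))
    v_min, v_max = max(0.0, min(vs)), min(float(height), max(vs))
    if u_min >= u_max or v_min >= v_max:
        return None
    return (u_min, v_min, u_max, v_max)


# Hypothetical example: a 2 m cube, 10 m in front of a 1920x1080 camera.
example_box = project_box_to_2d(
    center=(0.0, 0.0, 10.0), size=(2.0, 2.0, 2.0), yaw=0.0,
    fx=1000.0, fy=1000.0, px=960.0, py=540.0, width=1920, height=1080,
)
```

A real pipeline would additionally need the calibrated transform from the LiDAR frame to each camera frame, lens distortion handling and a policy for partially visible objects; none of these details are given in the article.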

Applications and Outlook

The multisensor dataset opens up a wide range of application fields:

• Training of AI-based environment perception systems
• Development of driver assistance systems (GoA2)
• Object detection for fully automated applications (GoA4)
• Infrastructure analysis and digital maintenance
• Simulation for autonomous train control

Work on the dataset has been completed. This marks an important milestone and provides new momentum for the sector. The dataset can now be made available to interested organisations and industry partners as a basis for developing environment perception systems (requests via e-mail to martin.koeppel@deutschebahn.com or philipp.neumaier@deutschebahn.com).

Further information is available in a detailed article published in the trade journal Deine Bahn (August 2025 issue).
