In the Pan-European Railway Data Factory, sensor data is collected and processed across Europe to make it available for different players in the railway sector.

Pan-European Railway Data Factory creates basis for AI-based railroad operations

The European railway sector is currently facing one of the biggest technological leaps in its history: many railway infrastructure managers and railway companies are striving for a high degree of automation and digitization of railroad operations. The aim is to significantly increase the capacity and reliability of rail operations. In Germany, the Digitale Schiene Deutschland sector initiative (DSD) has been set up for this purpose.

Automation and digitization go hand in hand with the increasing generation and use of data in the rail system. Especially in the context of fully automated driving (Grade of Automation 4, GoA4), sensors and cameras are needed so that artificial intelligence (AI) can react automatically to hazards in the rail environment. Developing such AI software for environment perception, for example, requires very large amounts of data of very high quality. However, it appears that individual rail companies or railway vendors will not be able to collect enough sensor data on their own to sufficiently train AI for fully automated rail operations.

For this reason, it is assumed that a kind of pan-European Railway Data Factory is needed – an ecosystem with a common infrastructure that enables railroad companies and suppliers to collect and process sensor data and make it available for shared use. In this way, simulations can be carried out and AI models can be trained, certified and used in automated railroad operations.


The CEF2 RailDataFactory project is a study jointly conducted by DB, SNCF and NS and co-funded by the European Health and Digital Executive Agency (HaDEA). Similar programs also exist in other European countries, as do large EU-wide initiatives such as Europe’s Rail. The study assesses the feasibility of a pan-European Railway Data Factory from a technical, economic, legal, regulatory and operational perspective, in order to identify the key aspects that need to be addressed for the pan-European Railway Data Factory to be successful.

In particular, the study aims to

  • identify the main operational scenarios and use cases that a data factory should cover;
  • derive the requirements of these scenarios and use cases for the data factory infrastructure (in particular with respect to the Pan-European Railway Data Factory Backbone Network, security, and data and IT platforms);
  • determine legal and regulatory aspects to be considered as well as a possible economic incentive model for the Data Factory;
  • make concrete recommendations on how a pan-European Railway Data Factory could be set up (possibly taking into account the advantages and disadvantages of different options if no single recommendation can be made).

The study started in January 2023 and is expected to be completed in September 2023. An established Rail Advisory Board and tight synchronization with Data Factory-related activities in the EU-Rail FA2 R2DATO project ensure that the study reflects the needs of the rail sector and is aligned with similar activities.

In the first deliverable, "D1 - Data Factory Concept, Use Cases and Requirements", the project partners describe the basic concept of the planned pan-European Railway Data Factory and list the identified operational scenarios and use cases. From these, requirements are derived, e.g. for the required pan-European Backbone Network and for the legal, regulatory and IT security aspects that need to be considered. In the meantime, three more study results are available (see link).


The deliverables are available here: