OptiPEx’s Living Lab Data Collection

OptiPEx’s Living Lab data collection involves edge processing of camera and other sensor data, carried out in strict compliance with GDPR. Read the blog post below by OptiPEx partner Tiia Järvenpää from Teleste to find out more.

Algorithm development for public transport under GDPR

GDPR (General Data Protection Regulation) protects the privacy of individuals in the EU by regulating how organizations collect, use and store personal data. It gives individuals rights, such as the ability to restrict how their data is used. Organizations are still allowed to collect personal data, but they must clearly define the purpose of collection and ensure the data is handled securely.

Measuring passenger behaviour onboard public transport vehicles inherently involves observing real people, including their movements, interactions and responses to different conditions. Video or audio recordings that capture these behaviours may include identifiable features such as a face, body shape or voice, which makes the recordings personal data under GDPR.

To develop and validate algorithms that improve safety, crowd management and accessibility, OptiPEx must work with data that reflects authentic human behaviour. Because of this, video and audio recordings used in the project must be created and handled according to GDPR rules.

Video and audio recorded during normal public transport operation can only be used for specific legal reasons, such as preventing and detecting vandalism and aggressive behaviour, and investigating incidents. Algorithm development is not one of these reasons, unless everyone who appears in the recordings has given permission for their personal data to be used for that purpose. Synthetic data can assist in algorithm development but cannot fully replace real-world variability. Therefore, controlled, consent-based recordings are necessary to ensure both technical reliability and full GDPR compliance.

Options for GDPR-compliant video data

1. Using open datasets

There are publicly available datasets related to human behaviour. However, these datasets are often released under restrictive licenses that limit their use to academic or non‑commercial research. They may also fail to match practical needs regarding camera placement, video quality or scenarios. Open datasets that are both legally usable for commercial purposes and technically suitable are rare.

2. Recording videos in controlled conditions

Recording new video data in a controlled setting is a reliable and GDPR‑compliant option. In this approach, only individuals who have been informed in advance and who have given explicit consent are recorded. Organizing such recordings requires planning and resources, but the resulting datasets are highly valuable and tailored to the exact needs of algorithm development.

The planning phase is crucial to ensure legal compliance, realism and data quality. Preparation should include at least:

  • Selecting a suitable recording location
  • Recruiting participants
  • Collecting explicit informed consent
  • Designing recording scenarios or scripts that participants follow
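
The consent rule behind this approach can be expressed as a simple check when assembling a dataset. The sketch below is illustrative only and does not describe OptiPEx tooling: the names `Participant`, `RecordingSession`, `is_usable_for_development` and `usable_sessions` are hypothetical, and it assumes that consent status is tracked per participant for each recording session.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    participant_id: str             # hypothetical pseudonymous identifier
    informed: bool = False          # briefed in advance about the recording
    explicit_consent: bool = False  # consented to use for algorithm development


@dataclass
class RecordingSession:
    session_id: str
    participants: list[Participant] = field(default_factory=list)


def is_usable_for_development(session: RecordingSession) -> bool:
    """True only if the session has participants and every one of them
    was informed in advance and gave explicit consent."""
    return bool(session.participants) and all(
        p.informed and p.explicit_consent for p in session.participants
    )


def usable_sessions(sessions: list[RecordingSession]) -> list[RecordingSession]:
    """Keep only the sessions that pass the consent check."""
    return [s for s in sessions if is_usable_for_development(s)]
```

A single participant without recorded consent excludes the whole session, which mirrors the requirement that everyone appearing in a recording must have agreed to its use.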

3. Generating synthetic videos using AI

AI‑generated video is an increasingly attractive complement or alternative to real recordings. If videos are created without using real people, they do not contain personal data and GDPR does not apply. Synthetic data can be generated quickly compared to real recordings, and it is especially useful for dangerous or ethically difficult scenarios that would be hard or unsafe to record with real participants.

A hybrid approach is often the most effective: open datasets can support early proof‑of‑concept work, consent‑based recordings provide realism and can be tailored to the specific needs of the algorithm, and synthetic data can be used to expand coverage and volume.

OptiPEx: Recording events as a practical example

In the OptiPEx project, controlled recording has been a key method for generating realistic and GDPR‑compliant datasets. The project benefits from access to four living labs: trams in Tampere (Finland), Linz (Austria) and Zaragoza (Spain), and a shuttle bus in Chemnitz (Germany). These living labs provide realistic environments for testing new solutions and for recording consent‑based video material.

In May 2025, recording sessions were conducted using a tram in Linz, focusing on scenarios with varying passenger densities. In December 2025, similar recordings were carried out in Tampere, this time including additional elements such as bicycles, baby strollers and luggage. In both cases, groups of volunteers, including students and consortium employees, were recruited, and the tram operated within depot areas. Participants boarded and exited the tram according to predefined scripts, and video and audio were recorded.

These recording events fully complied with GDPR requirements while producing high‑quality data from real public transport vehicles. As a result, the OptiPEx consortium now has access to valuable, realistic datasets that directly support algorithm development and testing without compromising passenger privacy.