Lightweight Acquisition and Large-scale Mining of Trajectory Data (Trajectories)

  • Stefan Funke
  • Sabine Storandt

Modern smartphones are equipped with an array of powerful sensors that can continuously sense the ambient space. In contrast to GPS units, most of these sensors have very modest power requirements, so it is feasible to keep them permanently turned on. For example, detecting nearby mobile network base stations, WiFi access points, and Bluetooth devices, or measuring acceleration, magnetic fields, and air pressure, can be performed continuously while hardly affecting the battery life of mobile devices. In principle, every mobile user is therefore a potential source of continuous geospatial data. The first goal of this proposed research is the systematic acquisition and processing of geospatial data from such lightweight sensors. Ideally, with everyone voluntarily contributing their sensor readings, one could process this data into a huge amount of fuzzy trajectory data, possibly even enriched with other contextual information. Exploiting this pool of trajectory data has the potential to help solve grand social challenges, e.g., in the fields of environment and disaster management, health, transport, and citizen participation. Unfortunately, the methodology for mining trajectory data on such a large scale is still in its infancy. Hence, the second goal of this proposal is the development of suitable algorithms and data structures for efficiently mining huge sets of trajectory data. Our results will also benefit other projects within the priority programme, as we provide a basic toolbox to efficiently acquire and work with trajectory data.

Selected Publications

  • Seybold, M. P. (2017). Robust Map Matching for Heterogeneous Data via Dominance Decompositions. In N. Chawla & W. Wang (Eds.), SDM (pp. 813–821). SIAM.
  • Barth, F., Funke, S., & Storandt, S. (2019). Alternative Multicriteria Routes. In ALENEX. SIAM.
  • Funke, S., Rupp, T., Nusser, A., & Storandt, S. (2019). PATHFINDER: Storage and Indexing of Massive Trajectory Sets. In Proceedings of the 16th International Symposium on Spatial and Temporal Databases (pp. 90–99). Vienna, Austria: ACM. DOI: 10.1145/3340964.3340978

Data Sets and Benchmarks

We extracted raw trajectory data as well as the German road and path network from OpenStreetMap and map-matched the trajectories to paths in the network using our novel map matching approach. The trajectory set contains about 372,000 trajectories with a total of 350 million data points. The network consists of about 60 million nodes and 120 million edges. Further details are provided in the enclosed README file. Download Data (5.4 GB compressed, 19 GB uncompressed)

Further openly available trajectory sets:


  • OSCAR, Spatial search engine for OSM planet data