Speed prediction in large and dynamic traffic sensor networks

doi:10.1016/j.is.2019.101444

Information Systems

Volume 98, May 2021, 101444

https://doi.org/10.1016/j.is.2019.101444

Highlights

• Dynamic traffic sensor networks bring challenges in the context of urban mobility
• We evaluate three approaches for speed prediction over large/dynamic sensor networks
• The global and cluster-based approaches provide accurate and robust prediction models
• The global approach solves the cold start problem
• We provide a large dataset and assess the effectiveness of the three approaches

Abstract

Smart cities are nowadays equipped with pervasive networks of sensors that monitor traffic in real time and record huge volumes of traffic data. These datasets constitute a rich source of information that can be used to extract knowledge useful for municipalities and citizens. In this paper we are interested in exploiting such data to estimate future speed in traffic sensor networks, as accurate predictions have the potential to enhance the decision-making capabilities of traffic management systems. Building effective speed prediction models in large cities poses important challenges that stem from the complexity of traffic patterns, the number of traffic sensors typically deployed, and the evolving nature of sensor networks. Indeed, sensors are frequently added to monitor new road segments or replaced/removed for various reasons (e.g., maintenance). Exploiting a large number of sensors for effective speed prediction thus requires smart solutions to collect vast volumes of data and train effective prediction models. Furthermore, the dynamic nature of real-world sensor networks calls for solutions that are resilient not only to changes in traffic behavior, but also to changes in the network structure, where the cold start problem represents an important challenge. We study three different approaches in the context of large and dynamic sensor networks: local, global, and cluster-based. The local approach builds a specific prediction model for each sensor of the network. Conversely, the global approach builds a single prediction model for the whole sensor network. Finally, the cluster-based approach groups sensors into homogeneous clusters and generates a model for each cluster. We provide a large dataset, generated from $\sim$1.3 billion records collected by up to 272 sensors deployed in Fortaleza, Brazil, and use it to experimentally assess the effectiveness and resilience of prediction models built according to the three aforementioned approaches. The results show that the global and cluster-based approaches provide very accurate prediction models that prove to be robust to changes in traffic behavior and in the structure of sensor networks.

Introduction

Highly populated cities increasingly face mobility challenges caused by transport and traffic. The huge volume of data collected by real-time traffic monitoring sensors provides new opportunities to develop models and algorithms that enhance transportation services towards intelligent transportation systems, in particular those dealing with traffic predictions. Vehicle speeds on road networks are determined by complex traffic processes governed by stochastic and non-linear interactions between individual drivers [1], hence predicting the speed of vehicles is as complex as predicting the underlying traffic processes. Short-term traffic prediction techniques have been investigated and exploited for some time [2]. However, the emergence of smart cities, where urban areas are covered by massive numbers of sensors, combined with the development of transportation technologies, requires traffic prediction techniques that are fast, scalable, and suitable for complex and heterogeneous sensor networks like those deployed in smart cities. Many different traffic sensor technologies are currently used to monitor road networks, such as those based on inductive-loop detectors, magnetometers, video image processors, microwave radar sensors, laser radar sensors, passive infrared sensors, ultrasonic sensors, passive acoustic sensors, and devices exploiting combinations of the aforementioned technologies [3].

In this work we focus on sensors capable of capturing the speed of vehicles traveling over large and dynamic road networks, where sensors can be added to or removed from the network for various reasons. We address the problem of training accurate prediction models that maintain their accuracy over time (we call this the model aging problem) and cope with structural changes affecting sensor networks (we call this the network dynamicity problem). In this context we assume that sensors collect their observations in the form of textual data and periodically send such information to a centralized entity. We also assume that a centralized entity is in charge of training prediction models according to the available sensor observations.

We address these challenges by proposing and analyzing three different approaches that can be used to train machine-learned prediction functions: local, global, and cluster-based. The local approach is the solution commonly used in the literature, where each sensor is considered separately from the others to train a specific predictive function. This approach suffers from the cold start problem and therefore hardly applies to dynamic sensor networks, where sensors may be added and removed on a daily basis. Moreover, in large and dynamic sensor networks the local approach requires training and maintaining a large number of different prediction models. To overcome these issues we propose the global and cluster-based approaches, where models are trained on data coming from all the sensors in the network (or groups of similar sensors, in the cluster-based case) to build resilient predictive functions. The global approach provides substantial benefits in terms of reduced complexity and costs. Furthermore, by relying on a single prediction function that is independent of specific sensors, the global approach naturally solves the cold start problem. Moreover, the global approach is expected to be robust with respect to structural changes occurring in sensor networks, thus also addressing the dynamicity problem.
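The difference between the local and global approaches, and why only the latter copes with a newly added sensor, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rows, the sensor identifiers, and the toy "model" (the mean speed per hour of day) are hypothetical and stand in for the machine learning algorithms actually used in the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical training rows: (sensor_id, hour_of_day, avg_speed_kmh).
ROWS = [
    ("s1", 8, 22.0), ("s1", 14, 41.0),
    ("s2", 8, 30.0), ("s2", 14, 47.0),
]

def fit(rows):
    """Toy stand-in for a learner: mean observed speed per hour of day."""
    by_hour = defaultdict(list)
    for _, hour, speed in rows:
        by_hour[hour].append(speed)
    return {hour: mean(speeds) for hour, speeds in by_hour.items()}

def fit_local(rows):
    """Local approach: one model per sensor, trained on that sensor only."""
    per_sensor = defaultdict(list)
    for row in rows:
        per_sensor[row[0]].append(row)
    return {sensor: fit(sensor_rows) for sensor, sensor_rows in per_sensor.items()}

def fit_global(rows):
    """Global approach: a single model trained on the rows of all sensors."""
    return fit(rows)

local_models = fit_local(ROWS)
global_model = fit_global(ROWS)

def predict_local(sensor, hour):
    """Returns None when the sensor has no model yet (cold start)."""
    model = local_models.get(sensor)
    return None if model is None else model.get(hour)

def predict_global(sensor, hour):
    """The sensor id is ignored: the single model serves every sensor."""
    return global_model.get(hour)
```

For a hypothetical sensor `"s3"` that was added after training, `predict_local("s3", 8)` returns `None` because no model exists for it, while `predict_global("s3", 8)` still returns a value (here `26.0`, the pooled mean for that hour).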

We also test a cluster-based approach to assess its potential as a viable compromise between the local and global approaches. Specifically, the cluster-based approach trains distinct predictive functions for groups of similar sensors, where sensors are clustered according to some similarity metric; depending on the number of clusters, its behavior resembles that of the local approach (when many clusters are used) or that of the global approach (when few clusters are used). From the experimental evaluation we cannot yet conclude that this approach indeed represents a good compromise, since the results are discordant and further work is needed. The contributions of this paper can be summarized as follows:

• We propose the global and cluster-based approaches for learning vehicle speed prediction functions in large and dynamic sensor networks.
• Driven by three experimental questions, we provide a comprehensive evaluation to assess the effectiveness of the predictive models trained according to the three approaches. The training is conducted by using different state-of-the-art machine learning algorithms on a large, real-world sensor dataset. The dataset covers a time span of 12 months, during which 130 sensors were added to the network and 145 were removed from it. The evaluation shows that the models created using the global approach represent good solutions when dealing with dynamic sensor networks, as they prove to be accurate and resilient both to model aging and to structural changes in the sensor infrastructure (which, in turn, includes the cold start problem).
• We release to the scientific community the real-world dataset used to assess our proposals. The dataset originates from $\sim$1.3 billion records collected throughout 2014 by 272 different road traffic sensors deployed in the city of Fortaleza, Brazil. Due to privacy concerns we do not release the original raw data, but a dataset obtained after an aggregation and cleaning process. To the best of our knowledge, this is the largest and richest dataset made publicly available for research on speed prediction in dynamic sensor networks.
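The cluster-based approach described in this introduction sits between the two extremes, and a small sketch can make the role of the number of clusters concrete. The source only says that sensors are grouped "according to some similarity metric", so the metric used here (closeness of mean speed), the grouping procedure (sorting and cutting into contiguous groups, not a real clustering algorithm), the data, and all function names are assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical per-sensor speed histories (km/h); values are illustrative only.
HISTORY = {
    "s1": [20.0, 22.0, 24.0],
    "s2": [21.0, 23.0, 25.0],
    "s3": [50.0, 52.0, 54.0],
    "s4": [58.0, 60.0, 62.0],
}

def cluster_by_mean_speed(history, k):
    """Assumed similarity metric: sensors with close mean speeds are similar.

    Sensors are sorted by mean speed and cut into k contiguous groups of
    near-equal size. This is a simple stand-in for a clustering algorithm.
    """
    ranked = sorted(history, key=lambda s: mean(history[s]))
    k = max(1, min(k, len(ranked)))
    base, extra = divmod(len(ranked), k)
    clusters, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        clusters.append(ranked[start:start + size])
        start += size
    return clusters

def fit_cluster_models(history, k):
    """Train one toy model (the pooled mean speed) per cluster.

    Returns a mapping from sensor id to the prediction of its cluster's model.
    """
    models = {}
    for members in cluster_by_mean_speed(history, k):
        pooled = [v for s in members for v in history[s]]
        cluster_mean = mean(pooled)
        for s in members:
            models[s] = cluster_mean
    return models
```

With `k=1` every sensor shares one model, which matches the global approach; with `k=len(HISTORY)` each sensor gets its own model, which matches the local approach; intermediate values of `k` pool only sensors that look alike under the assumed metric.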

The paper is structured as follows: Section 2 gives an overview of related work on the traffic prediction problem. Section 3 defines our prediction problem and discusses three approaches to solving it. Section 4 presents the dataset used in our experiments, as well as the pre-processing steps used to transform the data into a format suitable for speed prediction. Section 5 details the experimental evaluation and discusses the results. Finally, Section 6 draws the final conclusions and sketches potential lines of future research.

Section snippets

Related work

Short-term traffic prediction aims at estimating traffic conditions from a few seconds to a few hours in the future, based on current and past traffic information. The field has an extensive and longstanding research history that originates in the 1980s in the context of intelligent transportation systems. A comprehensive and recent survey [2] observes how this research area moved from a classical statistical perspective (e.g., ARIMA) to data-driven modeling techniques based on machine learning and

Problem definition

Let $S = \{s_{1}, \dots, s_{n}\}$ be a network of $n$ sensors overseeing the traffic conditions of a specific geographical area. Within a given time interval $T$, sensors in $S$ produce a collection of observations, where each observation is a triple $(t_{j}, s_{j}, x_{\mathit{speed}})$ recording the time $t_{j} \in T$ of the event of a vehicle passing by some sensor $s_{j} \in S$ with a speed $x_{\mathit{speed}}$.

Let us then denote by $O$ the set of average speed observations that are produced as follows: the whole time interval $T$ is split into time buckets of fixed length
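The aggregation of raw observations into average speeds per time bucket can be sketched as follows. The snippet above is truncated in this excerpt, so the bucket length is left as a free, hypothetical parameter (`bucket_seconds`), and the representation of time as seconds from the start of $T$, the function name, and the sample data are likewise assumptions.

```python
from collections import defaultdict
from statistics import mean

def average_speed_per_bucket(observations, bucket_seconds):
    """Aggregate raw (t, sensor, speed) triples into per-bucket averages.

    observations: iterable of (t, sensor_id, speed), with t expressed in
        seconds from the start of the interval T (an assumption).
    bucket_seconds: fixed bucket length; hypothetical parameter, since the
        value used in the paper is not given in this excerpt.
    Returns {(sensor_id, bucket_index): average_speed}.
    """
    grouped = defaultdict(list)
    for t, sensor, speed in observations:
        grouped[(sensor, int(t // bucket_seconds))].append(speed)
    return {key: mean(speeds) for key, speeds in grouped.items()}

# Illustrative raw observations; all values are made up.
RAW = [
    (10, "s1", 30.0), (200, "s1", 40.0),  # both fall in bucket 0 of s1
    (310, "s1", 50.0),                    # bucket 1 of s1
    (50, "s2", 20.0),                     # bucket 0 of s2
]
O = average_speed_per_bucket(RAW, bucket_seconds=300)
```

Buckets in which a sensor records no vehicle simply do not appear in the result; how such gaps are treated in the actual dataset is not stated in this excerpt.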

Dataset preparation

We evaluate the local, global, and cluster-based approaches introduced in Section 3 by means of a real-world dataset containing data from traffic sensors deployed in the city of Fortaleza (Brazil). The dataset is provided by Autarquia Municipal de Trânsito e Cidadania (AMC), the authority supervising Fortaleza’s road network. The raw dataset consists of about 1.3 billion records, collected by a network of 302 sensors during the whole year of 2014, for a total of 60 GB of data. Each record is

Experimental evaluation

In this section we discuss the experiments conducted to generate different prediction models and assess their performance. More specifically, Section 5.1 introduces the experimental setting used for the evaluation, while Section 5.2 introduces the experimental questions and discusses the results.

Conclusion and future work

Traffic forecasts should be accurate and robust to changes in traffic monitoring networks. When such changes may occur, traffic management systems should optimize management and advisory strategies to enhance decision-making capabilities and maintain an appropriate level of service. In this context we consider the problem of predicting the speed of vehicles by analyzing data collected from a large and dynamic network of sensors, where sensing devices are continuously added and removed to the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is partially supported by FUNCAP SPU 8789771/2017, UFC-FASTEF 31/2019, BIGDATAGRAPES (EU H2020 RIA, grant agreement No. 780751), MASTER (H2020, MSCA grant agreement 777695) and the OK-INSAID (MIUR-PON 2018, grant agreement No. ARS01_00917) projects. F. Lettich’s work has been supported by a University of Alberta Faculty of Science Research Grant.

References (35)

Vlahogianni E.I. et al., Short-term traffic forecasting: Where we are and where we’re going, Transp. Res. C (2014)
Zhang Y. et al., A gradient boosting method to improve travel time prediction
Castro-Neto M. et al., Online-SVR for short-term traffic flow prediction under typical and atypical traffic conditions, Expert Syst. Appl. (2009)
Friedman J.H., Stochastic gradient boosting, Comput. Statist. Data Anal. (2002)
Min W. et al., Real-time road traffic prediction with spatio-temporal correlations, Transp. Res. C (2011)
Zhang Y. et al., A comparative study of three multivariate short-term freeway traffic flow forecasting methods with missing data, J. Intell. Transp. Syst. (2016)
Hoogendoorn S.P. et al., State-of-the-art of vehicular traffic flow modelling
Klein L.A. et al., Traffic Detector Handbook, Vol. II, Tech. Rep. (2006)
M. Wojnarski, P. Gora, M. Szczuka, H.S. Nguyen, J. Swietlicka, D. Zeinalipour, IEEE ICDM 2010 contest: TomTom traffic...
Hamner B., Predicting travel times with context-dependent random forests by modeling local and aggregate traffic flow
Huang W. et al., Deep architecture for traffic flow prediction: deep belief networks with multitask learning, IEEE Trans. Intell. Transp. Syst. (2014)
Lippi M. et al., Short-term traffic flow forecasting: An experimental comparison of time-series analysis and supervised learning, IEEE Trans. Intell. Transp. Syst. (2013)
Wang D. et al., Traffic flow forecast with urban transport network
Lv Y. et al., Traffic flow prediction with big data: a deep learning approach, IEEE Trans. Intell. Transp. Syst. (2015)
Van Der Voort M. et al., Combining Kohonen maps with ARIMA time series models to forecast traffic flow, Transp. Res. C (1996)
Sun S. et al., A Bayesian network approach to traffic flow forecasting, IEEE Trans. Intell. Transp. Syst. (2006)
Shuai M. et al., An online approach based on locally weighted learning for short-term traffic flow prediction

