These presentations illustrate the wide range of applications of big data analytics to understand travel behavior and patterns. The applications include both international and U.S. studies, with analysis and insights on diverse topics including remote detection of illegal parking of e-scooters, the equity of developer impact fees, the route choice sets of taxis, origin-destination spatial patterns and repeated individual significant places, congestion, safety, and impacts of COVID-19 on travel.
Interactive, Web-based Platform for “Big” Transportation Data Integration and Analytics
Xiaofan Shu, University of Missouri, Columbia
Yaw Adu-Gyamfi, University of Missouri, Columbia
Carlos Sun, University of Missouri, Columbia, Ellis Library
Praveen Edara, University of Missouri, Columbia
The exponential growth of transportation data streams has brought new opportunities and challenges to transportation data warehousing. More data can improve the planning, monitoring, prediction, and management of transportation systems, but only if the manipulation of such large datasets can be automated efficiently. With increasing demand for modern data warehousing, there has been significant growth in commercial and open-source tools. This paper presents a fully open-source, web-based platform that leverages recent advances in big data to efficiently process multiple streams of transportation data and deploy a variety of applications that enable transportation agencies to make practical, data-driven decisions. Using a Hadoop and Spark cluster together with Graphics Processing Units (GPUs), the platform responds to different forms of user queries much faster than traditional data warehouses. A CPU-GPU architecture is proposed so that both visualization and analytics of large datasets can be carried out seamlessly in a web browser. The platform has two main components: a data center that provides the capacity to store large, heterogeneous datasets, and an applications development center that enables users to visualize and analyze large datasets in a web browser. The platform is fast, taking fractions of a second to run complex queries on probe, crash, detector, and transit datasets. Its interactive visualization applications can render responses to visual queries at a rate of about 600 milliseconds per 10 million rows.
Going the Extra Mile: Using Connected Vehicle Data to Study Commute Patterns in Relation to Impact Fees
Aisling O'Reilly (email@example.com), Civilitude
Daniel Hennessey, WGI
Jackson Archer, WGI
The City of Austin is considering the adoption of a street impact fee program, which would change the manner in which developers take responsibility for paying for their portion of growth on the City’s transportation network. In developing this program, the City split Austin into seventeen zones by which to determine the maximum impact fee that can be charged per state law. Using a day’s worth of vehicle trip data from connected vehicle data company Wejo, each zone’s vehicle miles traveled (VMT) data were assessed to determine the average trip length during the morning commute period, during the evening commute period, and over the whole day. The purpose of these analyses was to determine whether certain areas of the city showed drastically different VMT patterns than others and what that impact might be on street infrastructure. We found that zones characterized by low amounts of employment and housing, typically on the periphery of the city, consistently generated the highest average VMT, whereas central zones had the lowest average VMT. This ability to evaluate real-world data on travel patterns allows the City of Austin and other jurisdictions to consider VMT as a criterion for evaluating development, including the imposition of street impact fees. When developers choose to build in high-impact zones (high average VMT), it may be appropriate for them to pay a higher proportion towards growth mitigation than developers who build in low-impact zones (low average VMT), depending on the jurisdiction’s priorities and the type of growth they hope to incentivize.
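The zone-level comparison described in this abstract can be illustrated with a minimal sketch. The records, zone labels, and commute-period hours below are hypothetical assumptions for illustration; the abstract does not state how the periods were defined or how the Wejo data were structured.

```python
from collections import defaultdict

# Hypothetical trip records: (origin_zone, departure_hour, trip_miles).
trips = [
    ("A", 7, 12.0),
    ("A", 8, 8.0),
    ("A", 17, 10.0),
    ("B", 8, 3.0),
    ("B", 13, 2.0),
    ("B", 18, 4.0),
]

# Assumed commute periods (hours of day); not taken from the abstract.
PERIODS = {"am": range(6, 9), "pm": range(16, 19)}


def average_trip_length(trips):
    """Return {zone: {period: mean miles}} for the am, pm, and daily periods."""
    totals = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for zone, hour, miles in trips:
        # Every trip counts toward "daily"; commute trips also count toward their period.
        labels = ["daily"] + [p for p, hours in PERIODS.items() if hour in hours]
        for label in labels:
            acc = totals[zone][label]
            acc[0] += miles
            acc[1] += 1
    return {
        zone: {label: total / count for label, (total, count) in periods.items()}
        for zone, periods in totals.items()
    }


averages = average_trip_length(trips)
```

In this toy data, zone A stands in for a peripheral zone with longer average trips and zone B for a central zone with shorter ones.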
Crowd-sourcing Micro-mobility Parking Violation Reporting – User Interface Design Motivation and Analytical Opportunities from Data Collected
Chintan Pathak (firstname.lastname@example.org), University of Washington
Borna Arabkhedri, University of Washington
Don Mackenzie, University of Washington
A surge in shared micromobility services has been accompanied by an increase in vehicle parking violations and associated public complaints. Most micromobility vehicles are unable to automatically detect a parking infraction, residents do not have a unified method of reporting the parking violations of vehicles, and regulating agencies struggle to handle the volume of incoming reports. This paper introduces a shared micromobility parking infraction reporting tool that is geo-sensitive and utilizes the popular features of a user’s smartphone to deliver high-quality actionable reports to the companies and cities. The tool was informed by interviews with local government workers responsible for overseeing micromobility in their communities, and is intended to streamline and standardize the process for users to report micromobility parking problems. Copies of reports are stored in a database and can be viewed through a web-based dashboard. The paper closes with some illustrative analyses based on data collected in Seattle, Washington and Portland, Oregon.
A Big-Data Driven Approach to Analyzing and Modeling Human Mobility Trend under Non-Pharmaceutical Interventions during COVID-19 Pandemic
Songhua Hu, University of Maryland
Chenfeng Xiong (email@example.com)
Hannah Younes, University of Maryland, College Park
Weiyu Luo, University of Maryland
Lei Zhang, University of Maryland, College Park
During the unprecedented coronavirus disease 2019 (COVID-19) challenge, non-pharmaceutical interventions became a widely adopted strategy to limit physical movements and interactions and thereby mitigate virus transmission. For situational awareness and decision support, quickly available yet accurate big-data analytics about human mobility and social distancing is invaluable to agencies and decision-makers. This paper presents a big-data-driven analytical framework that ingests terabytes of data daily and quantitatively assesses the human mobility trend during COVID-19. Using mobile device location data from over 100 million monthly active samples in the United States, the study measures human mobility with three main metrics at the county level: daily average number of trips per person; daily average person-miles traveled; and daily percentage of residents staying home. A set of generalized additive mixed models is employed to disentangle the policy effect on human mobility from other confounding effects, including virus effect, socio-demographic effect, weather effect, and spatiotemporal autocorrelation. Results reveal that policy has a limited, time-decreasing, and region-specific effect on human movement. The stay-at-home orders contribute only a 3%-7.3% decrease in human mobility, while the reopening guidelines lead to a 1%-4.7% mobility increase. Results also indicate a reasonable spatial heterogeneity among U.S. counties, wherein the number of confirmed COVID-19 cases, income levels, and age and racial distribution play important roles. The data informatics generated by the framework are made available to the public for a timely understanding of mobility trends and policy effects, as well as for time-sensitive decision support to further contain the spread of the virus.
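As a rough illustration of the three county-level metrics named in this abstract, the sketch below aggregates hypothetical per-device daily summaries. The record layout and the rule that a device with zero trips counts as staying home are assumptions, not the authors' definitions.

```python
# Hypothetical per-device daily summaries for one county:
# (device_id, trips_made, miles_traveled)
devices = [
    ("d1", 0, 0.0),   # made no trips
    ("d2", 4, 20.0),
    ("d3", 2, 10.0),
    ("d4", 0, 0.0),   # made no trips
]


def county_mobility_metrics(devices):
    """Compute the three daily metrics for one county from device summaries."""
    n = len(devices)
    if n == 0:
        raise ValueError("no devices observed")
    total_trips = sum(trips for _, trips, _ in devices)
    total_miles = sum(miles for _, _, miles in devices)
    # Assumption: a device that made zero trips is treated as staying home.
    at_home = sum(1 for _, trips, _ in devices if trips == 0)
    return {
        "trips_per_person": total_trips / n,
        "person_miles_traveled": total_miles / n,
        "pct_staying_home": 100.0 * at_home / n,
    }


metrics = county_mobility_metrics(devices)
```

A production pipeline would also need to weight the device sample to the resident population, which this sketch omits.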
Impacts of COVID-19 Pandemic on the Travel Behaviors of Free-Floating Electric Bike Sharing Service Users: An Unsupervised Learning Method
Seung Eun Choi, Yonsei University
Jinhee Kim (firstname.lastname@example.org), Yonsei University
Dayoung Seo, Elecle
Yeonjin Cho, Elecle
Although real-time data collection enables accurate representations of the specific time and location of trips, it is limited in what it reveals about the trip behaviors and characteristics of each user. Such interpretations become extremely important when life-changing events such as the COVID-19 pandemic occur. This study observes the trip behaviors of user groups of a free-floating bike sharing system before and after the onset of the pandemic. Multiple features are extracted for each user to explain the hidden characteristics of the data, and an unsupervised learning method is proposed to cluster those users and evaluate their similarities based on the extracted features. The study found evidence of an overall change in commuter patterns and an increase in leisure-type travelers during the pandemic. The time of recovery for trips with specific purposes was also visible, along with a finding that the users most consistent in behavior are first-mile/last-mile travelers.
Mining Route Set Distribution Range And Affecting Factor Threshold Based On GPS Data
Yajuan Deng (email@example.com), Chang'an University
Sanghuiyu Yan, Chang'an University
Peng Zhang, China Design Group Co., Ltd.
Xianbiao Hu, Missouri University of Science and Technology
Traditional route generation algorithms may produce a large number of routes that few drivers would choose in reality, whereas other route generation algorithms need thresholds for route set generation but lack data to support them. To avoid invalid route generation, reduce computation time, and provide a scientific basis for the generation of navigation routes and traffic assignment, this manuscript confines the route set size by mining the spatial distribution range of route sets and the thresholds of factors affecting route set generation. The article first uses GPS data to determine hotspots based on a hotspot-OD-identification method, then mines the spatial distribution range of the route set using the standard deviational ellipse. Finally, the factors affecting the generation of route sets are selected, and the Classification and Regression Trees (CART) algorithm is used to mine their thresholds. The results show that the spatial distribution range of the route set is elliptical, and that the maximum number of left turns per kilometer that the driver can accept for short, medium, and long travel distances is 1.336, 0.812, and 0.434, respectively. The maximum travel time per kilometer is 1.398 min, 3.594 min, and 3.594 min, respectively. The maximum number of road intersections per kilometer is 1.897, 1.407, and 1.412, respectively. The implications of the results for reducing the search range and search time of the route set, and for traffic network design and route navigation, are also discussed.
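For readers unfamiliar with the standard deviational ellipse mentioned in this abstract, the following is a minimal sketch of one common covariance-based formulation. It is not the authors' implementation; the sample points are hypothetical, and coordinates are assumed to be in a projected system (for example, meters) rather than raw latitude and longitude.

```python
import math


def standard_deviational_ellipse(points):
    """Return (center, semi_major, semi_minor, angle_rad) for 2-D points.

    Covariance-based formulation: the semi-axes are the square roots of the
    eigenvalues of the 2x2 covariance matrix, and the angle is the direction
    of the major axis, measured counterclockwise from the x-axis.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Closed-form eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]].
    half_trace = (var_x + var_y) / 2.0
    spread = math.hypot((var_x - var_y) / 2.0, cov_xy)
    semi_major = math.sqrt(half_trace + spread)
    semi_minor = math.sqrt(max(half_trace - spread, 0.0))
    angle = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    return (mean_x, mean_y), semi_major, semi_minor, angle


# Hypothetical GPS points along routes between one origin-destination pair.
sample_points = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
center, semi_major, semi_minor, angle = standard_deviational_ellipse(sample_points)
```

Some GIS tools scale the axes by an additional constant factor, so axis lengths from this sketch may differ from those reported by such tools.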
Using Big Data and Interactive Maps for Long-term and COVID-era Congestion Monitoring in San Francisco
Bhargava Sana, San Francisco County Transportation Authority (SFCTA)
Xu Zhang, Kentucky Transportation Cabinet
Joe Castiglione, San Francisco County Transportation Authority (SFCTA)
Mei Chen, Kentucky Transportation Center
Dr. Gregory Erhardt, University of Kentucky
The San Francisco County Transportation Authority (SFCTA) monitors roadway performance as part of the biennial Congestion Management Program (CMP). Recently, SFCTA switched from floating car runs to INRIX probe data as the primary source for tracking roadway speeds and monitoring congestion. The transition resulted in considerable savings in time and effort. However, integrating and processing the probe data still required a significant amount of manual work. To improve the efficiency of the effort, the CMP team developed and implemented a data processing pipeline that includes an automated network conflation process, an efficient framework for processing big speed data, and an interactive web-based visualization. In addition, all the scripts and code developed were made open source and are readily accessible from a public repository on GitHub. With the accelerated spread of the COVID-19 pandemic and the subsequent shelter-in-place order in the San Francisco Bay Area, the region’s traffic conditions and congestion changed rapidly. While the congestion impacts of COVID-19 were evolving swiftly, the CMP data pipeline made it possible to create an agile and timely web-based visualization, the COVID-Era Congestion Tracker. Furthermore, the pipeline enables the website to be updated weekly as the source data become available. Such processes and tools can help transit agencies and planning organizations track and identify congestion patterns as they evolve.
Micromobility Trip Origin and Destination Inference using General Bikeshare Feed Specification (GBFS) data
Yiming Xu (firstname.lastname@example.org), University of Florida
Xiang Yan, University of Florida
Virginia Sisiopiku, University of Alabama, Birmingham
Louis Merlin, Florida Atlantic University
Fangzhou Xing, Microsoft Corporation
Xilei Zhao, University of Florida
Emerging e-scooter services have great potential to enhance urban mobility, but greater knowledge of their usage patterns is needed. General Bikeshare Feed Specification (GBFS) data give researchers good opportunities to study e-scooter usage, but effort is needed to infer trips from the available data. Existing e-scooter trip inference methods assume that the vehicle ID of an e-scooter does not change, so they cannot handle data with changeable vehicle IDs. In this study, we propose a comprehensive package of algorithms to infer trip origins and destinations for all existing vehicle ID types based on GBFS data in Washington D.C. The inference accuracy of the proposed algorithms is then evaluated by R-squared, mean absolute error, and sum absolute error. Using one week of GBFS data published by six vendors in Washington D.C., trip origins and destinations are inferred using the proposed algorithms. The usage of e-scooter services has two peaks on weekdays, and these peaks are more prominent than in previous studies, which indicates that e-scooter services may be becoming more frequently used for commuting trips. Many areas with a high intensity of e-scooter usage are centered on or close to metro stations, which suggests that e-scooter services are being used to address the "first mile/last mile" problem in coordination with heavy rail. For inference accuracy, the R-squared of the evaluated algorithms is larger than 0.9 with the 400m×400m grid, which indicates that the proposed algorithms have good inference accuracy.
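To make the trip inference problem concrete, the sketch below shows only the simple static-ID case that this abstract attributes to existing methods: a vehicle that disappears from the list of available vehicles and later reappears somewhere else is treated as having made a trip. It is not the authors' algorithm and does not handle changeable vehicle IDs. The snapshot layout, the `infer_trips` name, and the movement threshold are assumptions for illustration.

```python
def infer_trips(snapshots, min_move_deg=0.0005):
    """Infer trips from time-ordered snapshots of available vehicles.

    snapshots: list of (timestamp, {vehicle_id: (lat, lon)}).
    A trip is recorded when a vehicle was absent from the previous snapshot
    and reappears at least `min_move_deg` degrees away from where it was
    last seen. The trip start is the time the vehicle was last seen.
    """
    last_seen = {}     # vehicle_id -> (timestamp, (lat, lon))
    prev_ids = set()   # vehicle IDs present in the previous snapshot
    trips = []
    for timestamp, vehicles in snapshots:
        for vid, pos in vehicles.items():
            if vid in last_seen and vid not in prev_ids:
                t0, origin = last_seen[vid]
                moved = max(abs(pos[0] - origin[0]), abs(pos[1] - origin[1]))
                if moved >= min_move_deg:
                    trips.append({
                        "vehicle_id": vid,
                        "start": t0,
                        "end": timestamp,
                        "origin": origin,
                        "destination": pos,
                    })
            last_seen[vid] = (timestamp, pos)
        prev_ids = set(vehicles)
    return trips


# Hypothetical snapshots taken 60 seconds apart.
snapshots = [
    (0, {"s1": (38.90, -77.03), "s2": (38.91, -77.04)}),
    (60, {"s2": (38.91, -77.04)}),                        # s1 is in use
    (120, {"s1": (38.92, -77.05), "s2": (38.91, -77.04)}),
]
inferred = infer_trips(snapshots)
```

The movement threshold is there to ignore vehicles that drop out of the feed briefly without moving, for example because of a missed update.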
Mining Vehicle Trajectories to Discover Individual Significant Places: Case Study Using Floating Car Data in the Paris Region
Danyang Sun, Ecole des Ponts ParisTech
Fabien Leurent, Ecole des Ponts ParisTech
Xiaoyan Xie, Ecole des Ponts ParisTech
This study discovered the significant places of individual mobility by exploring vehicle trajectories from Floating Car Data. The objective is to detect the geo-locations of significant places and then identify their functional types. Vehicle trajectories were first segmented into meaningful trips to recover the corresponding stop points. A customized density-based clustering approach was implemented to cluster stay points into places and determine the significant ones for each individual vehicle. Building on that, a two-level hierarchical method was developed to identify the place types: it first identified the activity types by mixture model clustering on the staying characteristics, and then discovered the place types by assessing their profiles of activity composition and frequentation. A case study was conducted for the Paris region. As a result, 5 types of significant places were identified, including home place, work place, and 3 other types of secondary places. Results of the proposed method were compared with those from a commonly used rule-based identification, which showed highly consistent place recognition for the same vehicles. Overall, this study provides a large-scale instance of studying human mobility anchors by mining passive trajectory data without prior knowledge. Such mined information can further help to understand human mobility regularities and facilitate city planning.
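The first step in this abstract, segmenting a trajectory into trips to recover stop points, can be sketched as follows. The abstract does not give the segmentation rule, so the time-gap criterion, the 300-second threshold, and the `split_into_trips` name are all assumptions for illustration.

```python
def split_into_trips(pings, max_gap_s=300):
    """Split one vehicle's time-ordered pings into trips and stay points.

    pings: list of (timestamp_s, lat, lon).
    A gap longer than `max_gap_s` between consecutive pings ends a trip;
    the last ping before the gap is recorded as a stay point whose
    duration is the length of the gap.
    """
    trips, stays = [], []
    current = []
    for ping in pings:
        if current and ping[0] - current[-1][0] > max_gap_s:
            t_last, lat, lon = current[-1]
            stays.append({
                "lat": lat,
                "lon": lon,
                "arrive": t_last,
                "duration_s": ping[0] - t_last,
            })
            trips.append(current)
            current = []
        current.append(ping)
    if current:
        trips.append(current)
    return trips, stays


# Hypothetical pings for one vehicle, with a long pause after the third ping.
pings = [
    (0, 48.85, 2.35),
    (60, 48.86, 2.36),
    (120, 48.87, 2.37),
    (4000, 48.87, 2.37),
    (4060, 48.88, 2.38),
]
trips, stays = split_into_trips(pings)
```

Stay points recovered this way would then be the input to a density-based clustering step that groups repeated stays into places.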
Operationalizing Mobile Data for Urban Transport: Lessons from Freetown, Sierra Leone
Dunstan Matekenya, The World Bank
Fatima Arroyo Arroyo, The World Bank
Xavier Espinet Alegre, The World Bank
Marta Gonzalez, University of California, Berkeley
In recent years, researchers have demonstrated that digital footprints from mobile phones can be exploited to generate data useful for transport planning, disaster response, and other development activities, thanks mainly to the high penetration rate of mobile phones even in low-income regions. Most recently, in the effort to mitigate the spread of COVID-19, such data can be used to track mobility patterns and monitor the results of lockdown measures. However, as other scholars have rightly noted, most of the work has been limited to proofs of concept or academic studies, and it is hard to point to any real-world use cases. In contrast, in this work we share experiences of using mobile data in a developing country for the preparation of an urban mobility project in Freetown, Sierra Leone, funded by the World Bank. We share good practices in the following areas: accessing mobile data from telecom operators, frameworks for generating origin and destination matrices, and validation of results.
DISCLAIMER: All information shared in the TRB Annual Meeting Online Program is subject to change without notice. Changes, if necessary, will be updated in the Online Program and this page is the final authority on schedule information.