According to the 2015 Urban Mobility Scorecard, published in August by the Texas A&M Transportation Institute and INRIX, the average American commuter spends 42 hours each year in traffic, a figure that nearly doubles for drivers in the Los Angeles area. But now, thanks to the prevalence of cell phones and the data points they produce, AT&T may have found a way to help.
Working in collaboration with the California Department of Transportation and the University of California, Berkeley, AT&T is undertaking two projects in California to develop a smarter traffic design using aggregate and anonymous cell phone data.
To better understand how the projects aim to use data from the device in everyone’s pocket to forecast traffic patterns and reduce backups, we reached out to Assistant Vice President of AT&T Labs Chris Volinsky for more details.
[Wireless Week] What is the Connected Corridors project?
[Chris Volinsky] The overall project has two components: Connected Corridors and Smart Bay. The Connected Corridors study is designed to provide alternative routing to improve overall traffic flow specifically in the I-210 area of Los Angeles. Routes studied are included in a “play book” that traffic managers use to mitigate congestion areas. Based on information from the simulator, you can enhance and add new “plays” as needed.
The Smart Bay Project enables city planners to choose optimal locations for infrastructure projects or improvements. For example, using this technology, they could determine the best place for a tunnel or park and ride that will greatly ease mobility patterns in the San Francisco Bay area.
[WW] What is the role of cell phones in the project?
[CV] The aggregate and anonymous data from cell phones can help us better understand at a collective level how populations move about. Understanding this better can help us build a smarter, more efficient infrastructure. For instance, this data could help traffic managers create “play books” (described above) that chart traffic patterns and volume. So if a traffic accident occurred, it would allow an adjustment to be made to traffic lights to ease traffic flow.
[WW] When was the project started or when did the data collection for this project begin (or when will it, if it hasn’t started already)?
[CV] AT&T’s formal collaboration with UC Berkeley began in May 2014. The data is from November to December 2014 and June 2015. We’re currently building models to determine how aggregate and anonymous data sources perform with different models.
[WW] How is data being collected and from whom?
[CV] This type of data is created naturally through the service (AT&T) provides to our customers. All data used in this study is aggregate and anonymous records received from cell towers located in the study’s geographic area of interest.
We take several steps to ensure customers’ privacy. First, we anonymize the data, and then we aggregate it. We also inject “noise” – or random information – into the dataset. This helps to further strengthen our privacy efforts. It’s important to understand anonymized call detail records (CDRs) do not contain names or real phone numbers. The telephone number is replaced by a randomly generated number, and we are also obfuscating the time stamp.
Before aggregation, we move each CDR’s location information to a random location within a traffic analysis zone (TAZ) close to the original cell tower where the call occurred. This ensures further anonymization and makes it impossible to associate any individual call with a specific person within the original zone. Next, these randomly displaced calls are fed into the simulator, which aggregates the data.
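The steps described above (replacing the phone number, obfuscating the time stamp, and displacing the location within a zone) can be sketched roughly as follows. This is a minimal illustration, not AT&T's actual pipeline: the `CDR` and `Zone` types, the `anonymize` function, the rectangular zone shape, the uniform time jitter, and the choice to give each number one consistent pseudonym are all assumptions made for the example.

```python
import random
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class CDR:
    """Hypothetical call detail record with only the fields used here."""
    caller: str       # phone number, or a pseudonym after anonymization
    timestamp: float  # seconds since some epoch
    x: float          # tower location in arbitrary planar units
    y: float


@dataclass(frozen=True)
class Zone:
    """Hypothetical TAZ modeled as a rectangle (real zones are polygons)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


def anonymize(records, zone_of_tower, max_jitter_s=900.0, rng=None):
    """Return copies of the records with identity, time and place obscured.

    zone_of_tower is a caller-supplied lookup (assumed) that maps a tower
    location to the zone in which the record should be displaced.
    """
    rng = rng or random.Random()
    pseudonyms = {}  # assumption: one random ID per original number
    result = []
    for record in records:
        if record.caller not in pseudonyms:
            pseudonyms[record.caller] = secrets.token_hex(8)
        zone = zone_of_tower(record.x, record.y)
        result.append(CDR(
            caller=pseudonyms[record.caller],
            # Shift the time stamp by a random amount within the window.
            timestamp=record.timestamp + rng.uniform(-max_jitter_s, max_jitter_s),
            # Move the record to a random point inside the zone.
            x=rng.uniform(zone.min_x, zone.max_x),
            y=rng.uniform(zone.min_y, zone.max_y),
        ))
    return result
```

The pseudonym map exists only for the duration of one call to `anonymize` and is then discarded, so the output carries no link back to the original numbers.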
The output that is produced from the system is a set of arrows that moves across the screen to indicate traffic flow. One arrow represents an aggregation of many simulated – or virtual – CDRs. In essence, each arrow in the simulation is not associated with a single CDR. Instead these are random samples that are created from a probability distribution that is constructed from a set of CDRs in a given area.
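The idea that each arrow is drawn from a probability distribution, not taken from a single record, can be illustrated with a short sketch. It assumes the distribution is a simple empirical frequency table over (origin zone, destination zone) pairs; the interview does not say what form the real distribution takes, and the function names `build_distribution` and `sample_virtual_flows` are hypothetical.

```python
import random
from collections import Counter


def build_distribution(zone_pairs):
    """Turn observed (origin, destination) zone pairs into probabilities."""
    counts = Counter(zone_pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def sample_virtual_flows(distribution, k, rng=None):
    """Draw k virtual flows from the distribution.

    The draws are fresh random samples, so no returned item corresponds
    to one particular input record.
    """
    rng = rng or random.Random()
    pairs = list(distribution)
    weights = [distribution[p] for p in pairs]
    return rng.choices(pairs, weights=weights, k=k)
```

A display layer could then render each sampled pair as one arrow, with busier zone pairs appearing more often simply because they carry more probability mass.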
[WW] Does the data come only from AT&T users or from cell phones across the carrier spectrum?
[CV] The data comes only from AT&T customers. This gives us a statistically valid sample of the population, given that we know our market share in any given area. From the sample population, we’re able to identify trends and apply them to the broader population.
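One simple way to apply a single-carrier sample to the broader population is to divide the observed count in each area by the carrier's market share there. The sketch below assumes that plain inverse-share weighting; the interview does not describe the actual estimation method, and `expand_to_population` and its inputs are hypothetical.

```python
def expand_to_population(observed_counts, market_share):
    """Scale per-area counts from one carrier up to the whole population.

    observed_counts: area -> number of trips seen among the carrier's users
    market_share:    area -> carrier's share of users there, in (0, 1]
    """
    estimates = {}
    for area, count in observed_counts.items():
        share = market_share[area]
        if not 0.0 < share <= 1.0:
            raise ValueError(f"market share for {area!r} must be in (0, 1]")
        # Assumption: the carrier's users travel like everyone else in the area.
        estimates[area] = count / share
    return estimates
```

For example, 50 observed trips in an area where the carrier holds a 25 percent share would be scaled to an estimate of 200 trips.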
[WW] Is the study only tracking a specific set of users who have opted in or all cell phone users outlined in the previous question?
[CV] The study involves only AT&T customers, except those who have opted out, per our privacy policy. To reiterate, all data in this study is aggregate and anonymous. Furthermore, we obfuscate the data even before it goes into the simulator as an additional precaution to preserve anonymity. As mentioned above, it’s important to note that any customer can opt out of this. We hold their privacy in deep regard. Customer privacy is a fundamental commitment at AT&T; that’s why the data used is anonymized and aggregated. Nothing about specific individuals or their behavior could be deduced from what goes to UC Berkeley or Caltrans.
[WW] How is the data then being analyzed and put to use?
[CV] Our focus is at the macro level to understand broadly how populations move about. We’re in the research stage at this time and are currently building models to determine how aggregate and anonymous data sources perform with different models.
Developing smarter cities that include smart traffic design is critical. That’s why we’re conducting the research. It could mean saving taxpayers’ money and easing transportation woes. While we are still in the feasibility stages of this, we’ve shown value and the data’s been shared through models that show results. It allows people to start talking about what implementation might be.
The goal of Smart Bay is to use data to understand commuting trends and patterns in the Bay Area to inform Caltrans, which will allow local traffic managers to be more efficient. Connected Corridors aims to inform the strategies traffic managers will enact when they see clog points in a network, and to predict how traffic will flow in order to create those play books. Ultimately, it will not be about giving real-time data but about giving data to create play books for rerouting traffic when an accident or something else has caused congestion.
So we’re working together with the University of California, Berkeley and Caltrans to better understand how aggregate and anonymous cell phone data can provide enhanced demand forecasts required for traffic control and transportation planning.