Connected Vehicle Data Wrangling: Leveraging ML, AI, and V2X Tech to Make Roads Safer

Episode 19
December 13, 2023
15:44

Episode Summary

In this groundbreaking episode of Point B, Shawn Turner, Senior Research Engineer at Texas A&M Transportation Institute (TTI), sheds light on the innovative use of connected vehicle data, machine learning, and AI in the realm of road safety. Delving into the intricacies of data collection and analysis, Shawn outlines the challenges in handling massive datasets, various use cases of connected vehicle data, and how data is leveraged to enhance road safety. He also talks about the role of V2X and V2I technologies and the importance of ensuring data privacy and protection in this rapidly evolving field. All this and more in this episode of Point B!

Key Highlights

1:34 What is TTI?

2:24 Data sources and collection

4:05 Data wrangling and integration 

4:45 Challenges associated with data collection, analysis, and anonymization

6:04 Use cases for connected vehicle data

8:06 Leveraging data to increase road safety 

9:40 Methods for data analysis

11:19 The role of machine learning and AI 

12:21 V2X, V2I, and near-future technology adoption 

13:25 Ensuring data privacy and protection

Meet Our Guests

Steve Schwinke

Steve Schwinke is Vice President of Customer Engagement at Sibros, working closely with OEMs and Tier One suppliers to accelerate their connected vehicle solutions. He is a pioneer in the industry having spent 22 years at General Motors as an original Executive member of the OnStar team designing their first 3-button system, developing and launching numerous industry-first connected vehicle products and services. He is a recognized expert in connected vehicle technology having served on the Executive Board of Directors for the Telecommunications Industry Association and has been awarded 34 patents involving telecommunications, telematics, and navigation. Steve holds a Bachelor of Science in Electrical Engineering from the University of Michigan and a Master's of Science in Wireless Communication Systems from Santa Clara University.

Shawn Turner

Shawn Turner is a Senior Research Engineer at Texas A&M Transportation Institute (TTI), where he has conducted and managed applied research for 32 years. Shawn is a nationally recognized expert with practical experience in multimodal travel data collection and analysis, performance measures and monitoring, and mobility analysis. In short, he is a data nerd. Shawn works with the public and private sectors to advance the use of the best available and high-quality data in transportation.

Transcript

Shawn Turner:

I'm very excited about higher bandwidth to cars and hopefully seeing more progress in vehicles, talking to each other, and vehicles talking to the infrastructure. This is the V2I, the vehicles communicating with infrastructure, that's something that we've talked about in the public sector for a decade or more, but I think we've been slow to implement that.

Announcer:

Welcome back to another episode of Point B, a Sibros podcast where we interview industry experts about the latest innovations and trends in automotive technology and the connected vehicle industry.

Steve Schwinke:

Welcome to our Point B podcast, where we discuss the future of mobility and transportation products and services. My name is Steve Schwinke, Vice President of Customer Engagement at Sibros, and in today's episode, we're going to be exploring how connected mobility data is being used in transportation research. We'll be talking with Shawn Turner, Senior Research Engineer at Texas A&M Transportation Institute. Shawn is a nationally recognized expert with practical experience in multimodal travel data collection and analysis. In short, he's a data nerd. Shawn works with the public and private sectors to advance the use of the best available and high-quality data in transportation. Welcome, Shawn.

Shawn Turner:

Well, thank you, Steve. Excited to talk about data, one of my passions.

Steve Schwinke:

So let's just start with a little bit more background on Texas A&M Transportation Institute.

Shawn Turner:

Texas A&M Transportation Institute, or TTI as we're known, we're one of the largest, well-known university research institutes. We're a state agency, so we're part of the Texas A&M University system. Some TTI researchers also teach at the university, but most are like me, where they do full-time research. At TTI, we're a bit over 400 researchers, and we really span all areas of transportation. We say the soft side, so the traditional transportation planning and operations, but also the hard side, the materials, the pavements, the guardrails.

Steve Schwinke:

What sources of data is TTI using and where are you getting this data from?

Shawn Turner:

Yeah, so traditionally we've had to go out and collect our own data, but more and more we are getting data from the private sector, from either third-party data aggregators, from OEMs, or from software-as-a-service platforms. So it really just depends. One of the things that we do, and one of my other things that I call myself, is I'm a data scavenger. So I'm constantly looking around for, hey, is there data that's maybe being used in another industry that might help us in transportation?

Whenever we're getting data from mobile devices, we're typically limited. We have in the past been limited to a few very basic items like the vehicle location, the speed, and the heading. And so with that, over the past decade or so, we've made lots of progress in how we are, say, measuring traffic congestion. What we know about travel patterns in the aggregate, how people travel to and from different areas, information that gives us a little glimpse into driver behavior. For example, we can see speeding, we see crashes, but we want to know what else is going on, what is leading up to these behaviors, what else is going on with the car, in the car? And that's what we hope to get from the newer generation of connected cars.

Steve Schwinke:

And then you're overlaying that with data outside the vehicle, like traffic light information or camera data from the roadways, and putting together an entire story?

Shawn Turner:

Yes, that's exactly right. I mean, so a lot of what we do is it's really what I call data wrangling and it's data integration. Typically we spend about 80% of our time on this data wrangling and data integration, getting everything together on the DOT's road network, and then we're spending about 20% of our time on the actual analysis.

Steve Schwinke:

Some of the challenges though that you have getting the data, like anonymizing the data or that, can you talk a little bit more about some of those?

Shawn Turner:

Challenges in terms of the data. So the area and the market around connected car data, I would say, is in the early stages. And so one of the ... you know, I don't know that we have so many challenges with anonymizing the data. In some cases, data we're getting has already been aggregated, but even as researchers, we routinely deal with sensitive data sets about individuals, like crash records, health information, and so we've got these protocols in place to deal with sensitive data.

I would say one of the challenges that we face is getting the right data for the right application and for the use case. The other thing that we have a challenge with is that this is a growing industry, there's more and more vehicles being sold, there's changes in the processes of how this data is collected, and in some cases we want to do long-term trend monitoring. And so you're having to adjust for improvements as you're, so to speak, flying the plane.

Steve Schwinke:

Interesting. So tell me a little bit more about how this data is being used.

Shawn Turner:

I would say that we've gotten really good at identifying where the most congested roadways are, using that data to try and make improvements on these most congested roadways, and then being able to quickly identify the impacts of that. With the data sets that we had three years ago, we had this amazing glimpse into what happened when the pandemic hit in 2020. So that's where we've been, but where we're going right now and in the coming years is, again, it's this much richer data set that we're getting from all of these connected vehicle sensors. And so as I said before, it's better understanding driver behaviors. And then being able to quickly measure the results of improvements.

Typically, when we make a safety improvement, the measure of effectiveness is: did we reduce crashes? Well, statistically speaking, you have to wait a couple of years and we need quicker feedback.

And so connected cars and the richer data they provide us, they tell us about braking, about lane departures, about behaviors within the vehicle that give us much quicker feedback about whether we're doing a better job.

We're exploring other areas like road roughness. I think about the dash cams and embedded vision systems that can tell us, hey, when are the lane lines fading? Because they know, because they've got these lane keep assists and things like where the traffic signs are hard to read. Or maybe where signs were seen yesterday, but they're not seen today, and so something's happened, some signs have gotten knocked over.

Steve Schwinke:

You talked about crashes, but how do we use this data to make the roadway safer for everyone? We're starting to see more multimodal transportation. We're seeing people on bikes in these urban areas, pedestrians. Are you using data to, as I say, make the roadway safer for everyone?

Shawn Turner:

That is one of the areas, and that's actually another one of my passions is vulnerable road users. I am a cyclist, and I can tell you, as a transportation engineer and a cyclist, traditionally we have not collected much data or information about people walking or biking on our transportation system. It's just our focus and 90% of people are driving cars. But in major cities, there are a lot of people that are walking and biking.

Again, things like these computer vision systems that can actually see when pedestrians or cyclists are using the roadway or perhaps when there are near misses. And then being able ... letting cities or departments of transportation know about some of these areas proactively where near misses are occurring before something tragic happens; and motorcycle fatalities are way up in the U.S., and that is definitely one of the focus areas for safety. So I mean, the more we know about what's happening, the better informed we are at trying to fix the problems.

Steve Schwinke:

What methods are you using to analyze the data?

Shawn Turner:

I'll walk you through a typical safety study. We're pulling together all these different pieces of the puzzle. We've got the crash data, the roadway inventory, the traffic counts – those are mostly within public agencies, within departments of transportation. We've got speeds, that's coming ... We can get speeds from a number of different data providers. Now we've got driver behavior, hard braking, lane departures, other things like that that we can get from connected cars. Put that all in the same network. So in safety, we use this – I'm going to give you a $5 word – we use this empirical Bayes. It's a statistical approach that, in simple terms, it develops predictions about what you would expect to see in terms of safety, and then it compares it to what you're actually seeing.

And so what it tells you is where do you have the locations that are much worse than expected? And so a lot of what state DOTs are doing is they're trying to take ... in Texas, they're trying to take 80,000 miles of roadway, and they're trying to say, okay, we've got several thousand fatalities. Where do we start? Where are the areas that we focus on? And so it's a matter of trying to find those areas where not necessarily the most crashes are occurring, but where we have the highest potential to be able to reduce those crashes.
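
For readers who want to see the idea in code, the following is a minimal sketch of the kind of empirical Bayes screening Shawn describes. It is not TTI's implementation. It assumes the commonly used weighting form, in which a model's prediction for similar roads is blended with the observed count, and the weight depends on an overdispersion parameter `k`. The segment names, crash counts, and the value of `k` are all hypothetical, and the `Segment`, `eb_expected`, and `rank_by_excess` names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    predicted: float  # crashes predicted by a model for similar roads (hypothetical)
    observed: float   # crashes actually recorded over the same period (hypothetical)


def eb_expected(predicted: float, observed: float, k: float) -> float:
    """Blend the model prediction with the observed count.

    k is the overdispersion parameter of the prediction model (assumed here).
    The weight on the prediction shrinks as k * predicted grows, so the
    estimate leans more on the observed count at those sites.
    """
    weight = 1.0 / (1.0 + k * predicted)
    return weight * predicted + (1.0 - weight) * observed


def rank_by_excess(segments: list[Segment], k: float) -> list[tuple[str, float]]:
    """Rank segments by how far the EB estimate exceeds the prediction."""
    scored = [
        (s.name, eb_expected(s.predicted, s.observed, k) - s.predicted)
        for s in segments
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical example: segment B has more crashes, but segment A is
# further above what would be expected for a road of its type.
segments = [
    Segment("A", predicted=2.0, observed=10.0),
    Segment("B", predicted=8.0, observed=12.0),
]
ranking = rank_by_excess(segments, k=0.5)
```

In this made-up example, segment A ranks ahead of segment B even though B recorded more crashes, which mirrors the point above: the goal is to find locations that are much worse than expected, not simply those with the highest counts.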

Steve Schwinke:

Let me ask you what everyone's thinking about these days: how is machine learning and AI coming into play with regards to your analysis?

Shawn Turner:

We use machine learning a lot to develop models, and we're using computer vision and open-source tools, AI, to be able to extract features from video. So it's basically they are becoming a part of our standard toolbox. I'm a data nerd, I always will be, but I don't have the skills that some of the younger kids coming out of college have. One of the things that's amazing that I'm seeing is this ability to speak natural language and have it write code for me. I want to summarize the data this way, write the code for me, and allow me to create customized scripts for what I want to do.

Steve Schwinke:

Let's talk about some future or even near-term technology adoption that you see coming that's going to help with your mission of making our roadways safer, and I think the first thing that we want to talk about is V2X technology.

Shawn Turner:

I'm very excited about higher bandwidth to cars and hopefully seeing more progress in vehicles, talking to each other, and vehicles talking to the infrastructure. This is the V2I, the vehicles communicating with infrastructure, that's something that we've talked about in the public sector for a decade or more, but I think we've been slow to implement that. And the more that different transportation system users can share information amongst themselves and the roadside, I think the better off we will be.

Steve Schwinke:

Can you talk a little bit more about data privacy and some of the things that you're doing in that area to still get the information that you need but make sure that the public interest is not being violated?

Shawn Turner:

Number one, we want to respect people's privacy, and in fact, one of the state laws that was recently enacted within Texas two years ago was Senate Bill 475 (SB 475), which says that state agencies like TTI can only use opt-in data. So that's really a requirement for us for any data that we use is it's got to be opt-in, and the providers, if asked by us, have to be willing to provide sufficient information about their consent management. The second thing I will say about privacy is we're not trying to sell anything. We're trying to make roads safer. We're trying to save your life, save your mother's life, make sure that you can get home safely for your kids' soccer games, that you can get through traffic signals without having to listen to half a podcast. So we look at it as we're doing this for public safety and public interest.

Steve Schwinke:

So Shawn, I want to thank you for being a guest here on Point B, fascinating work that you're doing at TTI. I wish you the best of luck in the future, and I hope to have you back on my podcast again where we can talk a little bit more about 5G, those higher bandwidths, and some of the additional data sources that you have, helping you out, feeding your mission to make our roadways safer and less congested. And really just improving the situation, not only in Texas, but in the U.S. and globally. So thank you for being a guest here today.

Shawn Turner:

It was nice speaking with you and talking about one of my passions.

Announcer:

Thank you for tuning in to Point B. Join us next time for more auto-tech innovations and trends. Point B is brought to you by Sibros.