I selected this paper as a great case study on the need to consider adversarial scenarios when deploying IoT and smart city systems. It was also an eye-opener to me just how quickly the U.S. Department of Transportation (USDOT) is planning to roll out connected vehicle technology. In 2016 the USDOT awarded $45M to start small-scale deployment of connected vehicle (CV) systems as part of the CV Pilot Deployment Program. There are three live sites in New York, Anthem, and Palo Alto (of course!) as part of the Intelligent Traffic Signal System (I-SIG), which uses information from connected vehicles to influence the timing of traffic signals. I-SIG has been shown to reduce traffic delays by 26.6%. The USDOT has proposed to mandate that all new light-duty vehicles be equipped with CV technology. Even so, it is estimated it will take 25-30 years to reach at least 95% penetration of all vehicles on the roads.
Privacy issues aside, the improvement in traffic flow is significant. But this assumes everyone is playing by the rules. What if an adversary (perhaps just a single vehicle) tries to game the system? Maybe to speed their own passage through traffic light junctions, or perhaps to deliberately cause congestion and bring traffic to a halt. The authors look at data spoofing attacks on the I-SIG system, targeting design and configuration choices in the signal control algorithm (i.e., not relying on implementation bugs). As you could probably guess, they find exploitable weaknesses!
- Using just a single attack vehicle, the total traffic delay at a junction can be increased by up to 68.1%, completely reversing the expected benefit of the I-SIG system.
- Attacks can also completely jam entire approaches — in the example below vehicles queuing in left-turn lanes spill over and block through lanes, causing massive traffic jams. 22% of vehicles take over 7 minutes to get through the junction, when it should take 30 seconds.
As shown in our study, even though the I-SIG system has shown high effectiveness in benign settings, the current algorithm design and configuration choices are highly vulnerable to data spoofing.
Connected vehicle and I-SIG background
Data transmission takes place via the Dedicated Short Range Communications (DSRC) protocol, with DSRC devices embedded in connected vehicles via on-board units (OBUs), and in roadside infrastructure via roadside units (RSUs). Communication can be vehicle-to-vehicle or vehicle-to-infrastructure. Here we’re concerned with the latter case. Vehicles periodically broadcast Basic Safety Messages (BSMs), which include information on their real-time trajectory (location and speed). BSM messages are signed using a PKI system.
I-SIG uses BSM messages to perform more effective signal control at a junction.
- BSM messages are received by a trajectory awareness component, which maintains the latest trajectory for each vehicle indexed by vehicle ID, and also assigns vehicles to requested traffic light phases based on the intersection map.
- At the beginning of each stage in the light sequence, the signal planning component pulls real-time trajectory information for the vehicles in the intersection, performs planning, and sends signal control commands to the controller.
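As a rough illustration of the trajectory awareness step in the list above, here is a minimal Python sketch. The class, field names, and lane-to-phase map are all hypothetical and are not taken from the I-SIG code; the only behaviour it tries to capture is "latest trajectory per vehicle ID, grouped by requested phase".

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    vehicle_id: str
    lane: str            # lane name as it appears in the intersection map
    distance_m: float    # distance to the stop bar
    speed_mps: float


class TrajectoryAwareness:
    """Keeps the latest trajectory per vehicle and groups vehicles by phase."""

    def __init__(self, lane_to_phase):
        # Hypothetical intersection map: lane name -> requested signal phase.
        self.lane_to_phase = lane_to_phase
        self.latest = {}

    def on_bsm(self, trajectory):
        # A newer BSM from the same vehicle ID overwrites the older one.
        self.latest[trajectory.vehicle_id] = trajectory

    def vehicles_by_phase(self):
        phases = {}
        for t in self.latest.values():
            phase = self.lane_to_phase.get(t.lane)
            if phase is not None:  # ignore lanes that are not in the map
                phases.setdefault(phase, []).append(t)
        return phases


awareness = TrajectoryAwareness({"north_through": 2, "north_left": 5})
awareness.on_bsm(Trajectory("v1", "north_through", 120.0, 13.0))
awareness.on_bsm(Trajectory("v1", "north_through", 107.0, 13.0))  # update
awareness.on_bsm(Trajectory("v2", "north_left", 40.0, 0.0))
snapshot = awareness.vehicles_by_phase()
```

The signal planning component would then pull something like `snapshot` at the start of each stage.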
There are two algorithms used for signal planning: COP and EVLS. COP stands for controlled optimization of phases. It takes as input each vehicle’s estimated arrival time at the junction, and uses dynamic programming to calculate an optimal signal plan with the least estimated total delay. Total delays are estimated using a queuing model. If there are no vehicles requesting a certain phase, COP will skip the phase altogether. In theory, COP can plan over an unlimited number of stages, but in practice it is configured to look ahead only a set number of stages (two). This is due to a combination of the relatively low-powered hardware used in the roadside units (to keep costs down) and the real-time constraints of needing planning to finish in sufficient time (typically 5-7 seconds).
COP ideally wants all vehicles to be equipped with broadcasting equipment, and its effectiveness is greatly reduced when the portion of equipped vehicles falls below about 95%. EVLS (Estimation of location and speed) is used to bridge the gap before penetration reaches this level. EVLS uses the trajectory data that is available from equipped vehicles to estimate the trajectories of unequipped vehicles.
It is assumed that an attacker can compromise the on-board system in their own vehicle to send malicious BSM messages to the RSUs. “We do not assume that the attackers can spoof the sender identities in the BSM messages.”
To maximize the realism of our threat model, in this paper we assume that only one attack vehicle presents at an intersection. Since the COP algorithm targets optimized total delay for all vehicles in an intersection, which normally have over 100 of them, it should be very challenging for the data from one single vehicle to significantly influence the signal planning.
(But not as challenging as expected, it turns out!).
The attack vehicle doesn’t necessarily have to be in the traffic flow; it could just park nearby, listen to BSM messages from other vehicles, and seek chances to launch attacks. At which point, it seems to me it doesn’t necessarily have to be a vehicle either, just so long as the attacker has a device that can simulate an OBU and broadcast over DSRC.
Data spoofing attacks and the unusual influence of ‘the last vehicle’
The data flow in an I-SIG system under attack looks like this:
All incoming data is first subject to a geofence check, so the attacker needs to perform reconnaissance to know the bounding box and only generate location data that will pass the check. After this hurdle, the attacker’s goal is to change the values in the arrival table so as to influence planning in the COP algorithm.
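To make the geofence hurdle concrete, here is a small sketch of what such a check might look like. It assumes the fence is a simple latitude/longitude bounding box, which the post only implies; the coordinates and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geofence:
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Made-up coordinates for illustration only.
fence = Geofence(37.4200, 37.4230, -122.1420, -122.1380)


def accept_bsm(lat, lon):
    """Drop any message whose claimed position falls outside the fence."""
    return fence.contains(lat, lon)


inside = accept_bsm(37.4215, -122.1400)
outside = accept_bsm(37.5000, -122.1400)
```

A check like this only bounds the claimed position. Any spoofed location inside the box still gets through, which is why reconnaissance of the box is all the attacker needs.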
Strategy S1, which works both in a full deployment in which over 95% of vehicles are assumed to be connected, and in the transition period before that penetration rate (PR) is reached, is for an attacker to change the speed and location in its BSM message so as to target a given arrival table slot and increase its count by one.
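A sketch of how S1 might play out against a simplified arrival table. The table layout (a count per arrival-second and phase), the rule that stopped vehicles land in slot 0, and all function names are assumptions made for illustration rather than details from the I-SIG implementation.

```python
from collections import Counter


def arrival_slot(distance_m, speed_mps):
    """Estimated arrival time at the stop bar, in whole seconds."""
    if speed_mps <= 0:
        return 0  # assumption: stopped vehicles count as already queued
    return int(distance_m / speed_mps)


def build_arrival_table(vehicles):
    """vehicles: iterable of (phase, distance_m, speed_mps)."""
    table = Counter()
    for phase, distance_m, speed_mps in vehicles:
        table[(arrival_slot(distance_m, speed_mps), phase)] += 1
    return table


def spoof_for_slot(phase, target_slot, speed_mps=10.0):
    """Pick a claimed distance that lands mid-slot at the claimed speed."""
    return (phase, (target_slot + 0.5) * speed_mps, speed_mps)


honest = [(2, 150.0, 10.0), (2, 0.0, 0.0), (5, 60.0, 12.0)]
before = build_arrival_table(honest)
after = build_arrival_table(honest + [spoof_for_slot(phase=5, target_slot=20)])
changed = dict(after - before)
```

Here `changed` holds a single entry: the targeted slot for phase 5, incremented by one. In practice the claimed distance would also have to fall inside the geofence.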
Strategy S2 applies only during the transition period, and relies on confusing EVLS as it feeds into COP. Since the transition period is expected to last for the next 25-30 years, that seems to give plenty of opportunity! Manipulating the estimation results can influence the signal plan more significantly than simply changing one vehicle’s data as S1 does.
After some experimentation the authors determine that the most effective strategy for S2 is to target the queue estimation process. EVLS uses the available data from equipped vehicles to divide traffic in a lane into three regions: (i) the queuing region, including vehicles waiting in the queue with zero speed, (ii) the slow-down region which includes vehicles slowing down because of what is in front of them, and (iii) the free-flow region where vehicles are sufficiently far from the queue that they behave independently.
Among the three regions, we find that manipulating the estimation of the queuing region is most effective. The attacker can just set the speed to zero and set its location to the farthest possible point of the most empty lane within the geofence so that the lane can be fully filled with queuing vehicles after the estimation.
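A toy version of the queue estimation shows why this works. It assumes the queue is taken to extend back to the farthest stopped equipped vehicle, and that unequipped vehicles are filled in at a fixed spacing; the 7 m spacing and the function name are illustrative guesses, not values from the paper.

```python
def estimated_queue_length(equipped, spacing_m=7.0):
    """equipped: list of (distance_to_stop_bar_m, speed_mps) for one lane.

    Assumption: the queue extends back to the farthest stopped equipped
    vehicle, with one vehicle every `spacing_m` metres.
    """
    stopped = [d for d, v in equipped if v == 0]
    if not stopped:
        return 0
    return int(max(stopped) // spacing_m) + 1


honest = [(0.0, 0.0), (14.0, 0.0), (120.0, 12.0)]
spoofed = honest + [(280.0, 0.0)]  # zero speed at the far edge of the geofence

print(estimated_queue_length(honest))   # 3
print(estimated_queue_length(spoofed))  # 41
```

One spoofed message turns a three-car queue into an estimated forty-one, because the estimator fills the gap with vehicles it cannot see.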
The most successful attacks add a spoofed vehicle with a very late arrival time, causing the green light end time for its requested phase to be pushed out. This ‘last vehicle’ has an outsized effect on the signal planning.
In an unlimited COP implementation this wouldn’t have such a great impact, but when the planning horizon is limited to two stages there are limited opportunities in the plan to serve all vehicles, causing the planning to be significantly affected by the late arriver.
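The last-vehicle effect can be reproduced in a toy planner. To be clear, this is not COP: it swaps the dynamic program for a brute-force search over green durations, plans a single pass through the phases, and charges a flat penalty for any vehicle the plan fails to serve. All names and numbers are assumptions chosen to keep the example small.

```python
from itertools import product


def plan_greens(arrivals, phases, green_options=(5, 10, 15, 20, 25, 30),
                unserved_penalty=60):
    """Pick one green duration per requested phase to minimise total delay.

    arrivals: dict phase -> list of estimated arrival times in seconds.
    A vehicle arriving after its phase's green has ended is not served within
    the horizon and is charged a flat penalty (an assumption of this toy).
    """
    active = [p for p in phases if arrivals.get(p)]  # skip unrequested phases
    best_delay, best_plan = None, None
    for greens in product(green_options, repeat=len(active)):
        clock, delay = 0, 0
        for phase, green in zip(active, greens):
            start, end = clock, clock + green
            for arrival in arrivals[phase]:
                if arrival <= end:
                    delay += max(0, start - arrival)  # wait for green to start
                else:
                    delay += unserved_penalty
            clock = end
        if best_delay is None or delay < best_delay:
            best_delay, best_plan = delay, dict(zip(active, greens))
    return best_delay, best_plan


phases = ["A", "B", "C"]
honest = {"A": [0, 2, 4], "B": [0, 1]}
attacked = {"A": [0, 2, 4, 28], "B": [0, 1]}  # one spoofed late arrival on A

print(plan_greens(honest, phases))    # (9, {'A': 5, 'B': 5})
print(plan_greens(attacked, phases))  # (59, {'A': 30, 'B': 5})
```

With so few chances to serve the late arriver, the cheapest plan is to hold phase A green for 30 seconds. The spoofed vehicle never shows up, but the real vehicles waiting on phase B see their delay go from 9 to 59 seconds.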
Moving from theoretical analysis to the construction of real-time attacks, the authors assume an attacker with a 4-core laptop running a parallel I-SIG algorithm to try various data spoofing options and find the most effective one. It has a budget of about 4 seconds to determine the best BSM message to send in each window.
Testing using real traffic data from a junction and commercial-grade traffic simulation software (PTV VISSIM), the authors find that their attacks are very successful in creating traffic delays, including totally blocking approaches as described at the top of this post.
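The attacker's per-window search can be sketched as a budgeted loop over candidate messages. The paper's attacker runs this in parallel across cores; the version below is sequential for brevity. `estimate_delay` is a hypothetical callable standing in for the attacker's local copy of the planning algorithm, and the toy scoring function is invented purely so the example runs.

```python
import time


def best_spoof(candidates, estimate_delay, budget_s=4.0):
    """Return the candidate with the highest estimated delay found in time."""
    deadline = time.monotonic() + budget_s
    best, best_delay = None, float("-inf")
    for candidate in candidates:
        if time.monotonic() >= deadline:
            break  # out of time: send the best option found so far
        delay = estimate_delay(candidate)
        if delay > best_delay:
            best, best_delay = candidate, delay
    return best, best_delay


# Toy scoring function for illustration: later claimed arrivals hurt more.
candidates = [{"phase": 5, "arrival_s": t} for t in (5, 15, 28)]
choice, delay = best_spoof(candidates, lambda c: 2.0 * c["arrival_s"])
```

The deadline check matters more than the search strategy: a message that arrives after the planning window has closed is worthless to the attacker.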
The COP algorithm as is, is only optimal once the deployment penetration rate gets above 95%. One option is to design a better algorithm!
Considering that the transition period is unavoidable and may last as long as 30 years, we believe this calls for a joint research effort among the transportation and security communities to design effective and robust signal control algorithms specifically for the transition period.
It will also be important to improve the performance of the hardware in roadside units to enable planning over more than two (ideally five) stages.
Finally, we can add redundant sources of information to the system:
…to ensure high effectiveness, data spoofing detection on the infrastructure side needs to rely on data sources that attackers cannot easily control, e.g., infrastructure-controlled sensors, to cross validate the data in BSM messages… for example, the vehicle detectors buried underneath the stop bar of each lane as used to measure aggregated traffic information in today’s traffic control.
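One way such cross-validation might look, as a sketch only. It assumes stop-bar detector readings can be turned into a rough per-lane vehicle count that is comparable with the queue length derived from BSM data; the function name, tolerance, and numbers are all hypothetical.

```python
def suspicious_lanes(bsm_queue, detector_queue, tolerance=3):
    """Lanes where the BSM-derived queue exceeds the detector count by more
    than `tolerance` vehicles. Lanes without a detector reading count as 0."""
    return sorted(
        lane for lane, claimed in bsm_queue.items()
        if claimed - detector_queue.get(lane, 0) > tolerance
    )


bsm_queue = {"north_left": 41, "north_through": 6}
detector_queue = {"north_left": 3, "north_through": 5}
flagged = suspicious_lanes(bsm_queue, detector_queue)
```

Here `flagged` contains only `north_left`, the lane whose claimed queue is far larger than anything the infrastructure-controlled sensor has seen.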