Chasing Their (Long) Tails

Why scaling more test vehicles won’t get us a self-driving car.

By John Hayes

February 18, 2020

4 minute read

Congress reconvened last week to discuss self-driving car legislation. Testing was again at the heart of the debate. Automotive and technology companies sought to raise the number of Federal Motor Vehicle Safety Standards exemptions given out each year from 2,500 per company to 10,000 per company, as well as preempt states from making their own laws that might impede experimentation.

The argument is that more test cars on the road will ultimately deliver safer self-driving cars on a shorter timeline. Supporters believe scaling more permissive testing will get us there faster. Detractors fear for public safety, as more buggy, unproven cars take to our streets.

But this debate misses the crux of the problem. Nearly everyone agrees that observing lots of real-world driving situations is essential for training self-driving cars. And most people agree that testing unproven cars on the streets is dangerous. But this argument — and the technology approach of most self-driving car companies — conflates observation and testing. Observation and testing need not be packaged in the same vehicle. Separating observation and testing makes it possible to actually accelerate the progress of self-driving without risking public safety. But to do this, we need to start engineering self-driving cars differently.

The Current Process

Here’s how self-driving car companies build and test their cars today:

  1. Code software to drive a car.
  2. Build custom cars. Get permission for road testing. Pay drivers — in-vehicle, remote, and/or chase. Test performance. Observe real-world driving.
  3. Encounter bugs or new observations.
  4. Start over from scratch.

Why It’s Not Working

The flaw in this development process is the massive dependency on driving. In this system, both observation and software testing are limited by the speed and quantity of driving around. It adds a time-consuming and expensive real-world barrier into what should be a rapid iterative software loop. Essentially, progress in self-driving is massively inhibited by…lots of self-driving.

As constructed, self-driving companies are poorly organized to overcome this dependency due to fundamental limitations in their observation and testing capabilities.

Observation is critically important. Nearly everyone agrees that self-driving is a “long tail problem.” No two drives are alike — there is a lot of strange stuff that happens on our roads. The scale of observation is critical to success: the more real-world behavior the software has observed, the better the decisions it can make.

But these companies have put up extraordinary barriers to scaling observation:

  1. They need to build expensive custom cars.
  2. They must develop software to support them.
  3. They must request permits from local, state, and/or federal regulators.
  4. They must pay drivers, remote drivers and chase drivers to go drive around.

Each of these steps is really expensive and time-consuming.

In 11 years, Waymo has driven ~20M miles — which is the equivalent experience of only 200 people driving around in their everyday life in that same time frame. This simply isn’t very much data to train a computer to drive everywhere, in all conditions and scenarios — statistically, 200 people simply do not encounter that many unusual events. Perhaps it is no surprise that the car has yet to come to market.
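The 200-person comparison is easy to sanity-check with back-of-the-envelope arithmetic. The annual-mileage figure below is an assumption on my part (roughly what makes the article’s numbers line up; real averages vary by country and driver), not a number from the article:

```python
# Back-of-the-envelope check of the fleet-to-driver equivalence.
# Assumption (not from the article): an everyday driver covers
# about 9,000 miles per year.
fleet_miles = 20_000_000          # Waymo's total, per the article
years = 11
miles_per_driver_per_year = 9_000

equivalent_drivers = fleet_miles / (years * miles_per_driver_per_year)
print(round(equivalent_drivers))  # roughly 200 everyday drivers
```

Even doubling the assumed annual mileage only halves the figure — the fleet still amounts to the lifetime experience of a small town’s worth of drivers, not a country’s.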

Testing is also critically important. After observing a lot of driving behavior, the next step is translating those observations into driving actions and testing their efficacy — ensuring that the car works in every possible scenario.

Scale is also important for testing — the more scenarios available to test against, the higher the confidence in any given software release. These same issues — cars, permits and drivers — are barriers to scaling software testing in this system. This too makes testing extremely expensive and time-consuming. This is a radical departure from the ordinary development process of finding and fixing software bugs almost instantaneously, without having to leave the comfort of your desk.

The even bigger challenge with testing this particular system is repeatability. The core driving algorithm is constantly changing as new observations and responses are incorporated into the product. This is the benefit of a networked fleet — all self-driving vehicles continue learning and improving with experience. However, every software update needs to be tested, and all prior driving was conducted with the old software. The new software must be subjected to the same breadth of testing in order to prove its performance. This means every software release starts at zero miles driven.

In this system where every software update means you start testing from scratch, testing against a large number of scenarios is prohibitively expensive and time-consuming, so it is not really done. Any car on the road has at most a few thousand miles tested on the latest software release.

This process has failed to deliver a self-driving car over the past decade. It has also put the public in danger, already claiming one pedestrian’s life. Yet the proposed solution in Congress this week is more of the same: increase the size of the fleet, and with it the risk to public safety.

Measure Twice, Cut Once

The fundamental error in this development process is the conflation of observation and testing. When these two processes are separated, a totally new development paradigm emerges.

A better way to scale observation is to do it independently of a self-driving car. Simply recording normal people driving around in regular cars provides a treasure trove of useful real-world data — no self-driving car required. This dramatically scales observation at a much lower cost and without adding any risk to public safety. This is a structural advantage of a company like Tesla: its fleet numbers hundreds of thousands of cars, not hundreds. It collects real-world data from almost everywhere without having to self-drive everywhere.

This collection of real-world driving data can then be used to train and test a driving algorithm. No longer is a company forced to test on the street — it is possible to use recorded real-world observation to verify the quality of a driver before it ever goes out in public. The data is also durable, creating a long-term asset for repeated training and testing. It is now possible to test your latest software release against millions or even hundreds of millions of real-world miles in a matter of minutes.
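To make the replay idea concrete, here is a minimal sketch of testing a driving policy against recorded real-world logs. Everything here — the `Frame` type, the log format, the agreement metric — is hypothetical and illustrative, not any company’s actual pipeline:

```python
from dataclasses import dataclass

# Hypothetical record of one instant of real-world driving:
# what the sensors saw, and what the human driver actually did.
@dataclass
class Frame:
    sensor_snapshot: dict   # e.g. camera/radar readings at one instant
    human_action: str       # e.g. "brake", "cruise"

def replay_test(policy, log):
    """Score a candidate driving policy against recorded miles.

    Every new software release is replayed over the same stored log,
    so testing restarts at full scale instead of at zero road miles.
    Returns the fraction of frames where the policy agrees with the
    recorded human action.
    """
    agree = sum(policy(f.sensor_snapshot) == f.human_action for f in log)
    return agree / len(log)

# Toy usage: a trivial policy scored against a two-frame log.
log = [Frame({"obstacle": True}, "brake"),
       Frame({"obstacle": False}, "cruise")]
policy = lambda s: "brake" if s["obstacle"] else "cruise"
print(replay_test(policy, log))  # 1.0 — agrees on every recorded frame
```

The key property is that the log is a durable asset: rerunning `replay_test` on the next software release costs minutes of compute, not another fleet-year of road driving.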

This is a faster, cheaper and ultimately safer way to bring self-driving cars to market. Other self-driving companies are looking to Congress to solve what is ultimately an engineering problem, falsely pitting technological progress against public safety in the process.

With better engineering, we can have both.