CES was once a celebration of progress and enthusiasm in the autonomous vehicle industry. Now it has become an annual reminder of how much things have stayed the same. Again, companies announced more miles driven, reimagined more car interiors, and made more promises on a 5–15 year timescale. But none of them announced a real self-driving car for sale, so little has actually changed.
It is clear that robotics, the approach that has dominated the past decade of AV development, has proven inadequate to solving the massive complexity of real-world driving.
But there is hope. Starting in 2017, new autonomous vehicle companies, including my company, Ghost, began taking an end-to-end machine learning approach to self-driving. Based on some of the latest breakthroughs in machine learning, this learning approach represents a totally new way to drive a car and a radical departure from robotics, which has failed autonomous vehicles for the past decade.
Today I want to go in depth on this change and share why I believe this new learning paradigm is the only approach that will ultimately put self-driving cars on the road safely en masse.
That Was Then: Robotics
The first autonomous vehicle companies were born out of the robotics departments of top universities, inspired by DARPA’s Grand and Urban Challenges. The Department of Defense sponsored these contests to explore the potential applications for unmanned vehicles. The DARPA obstacle courses were fairly simple compared with the complexities of real-world driving and, at the time, a rules-based software solution for moving a physical robot through a constrained environment was the best choice for solving the problem. Plus, the deep learning techniques behind modern machine learning had not yet matured.
Robotics is fundamentally defined by constraints, which are painstakingly programmed by engineers. A look into the classic autonomous vehicle software stack reveals a long list of rules about what a computer cannot do: HD maps tell it where it can’t drive, object recognition tells it what it can’t hit, pre-programmed rules of the road tell it how it can’t behave. What’s left, the drivable surface minus impediments, is the path forward. But what worked well on an obstacle course has proven inadequate for the complexities of real-world driving.
When Robots Fail
Robotics works until your constraints come into conflict — when one of your rules forces you to break one of your other rules. But which rule to follow: should you stay in your lane or avoid a collision? In self-driving, people call these edge cases, the ‘exceptions’ that populate self-driving’s fat long tail. To be clear, edge cases are not in fact naturally occurring phenomena (like a deer running across the road), but instead the result of the software you write conflicting with other software you write (“stay in your lane” conflicts with “don’t hit deer”).
Unpacking and solving these edge cases is a complex and labor-intensive task, requiring that a person determine which rule takes precedence in every instance. It adds a new layer of human assumptions, where by definition you must de-emphasize something you have already deemed important. It’s hard work.
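To make the conflict concrete, here is a minimal sketch of a two-rule planner. Everything in it (the Scene type, the two rules, the action names, the resolve helper) is hypothetical and invented for illustration; it is not taken from any real AV stack. It shows how two individually sensible constraints can leave no legal action, and how a hand-written precedence order is what breaks the tie.

```python
from dataclasses import dataclass

@dataclass
class Scene:
    obstacle_ahead: bool  # e.g. a deer standing in our lane

# Hypothetical action set, deliberately tiny.
ACTIONS = ("keep_lane", "change_lane")

def stay_in_lane(scene, action):
    # Constraint: never leave the current lane.
    return action != "change_lane"

def avoid_collision(scene, action):
    # Constraint: never continue into an obstacle.
    return not (scene.obstacle_ahead and action == "keep_lane")

def allowed_actions(scene, rules):
    # An action is legal only if every rule permits it.
    return [a for a in ACTIONS if all(rule(scene, a) for rule in rules)]

def resolve(scene, rules_by_priority):
    # Hand-written precedence: drop the lowest-priority rule until
    # at least one action becomes legal again.
    rules = list(rules_by_priority)
    while rules:
        actions = allowed_actions(scene, rules)
        if actions:
            return actions[0]
        rules.pop()
    return ACTIONS[0]

rules = [avoid_collision, stay_in_lane]  # highest priority first
clear_road = Scene(obstacle_ahead=False)
deer = Scene(obstacle_ahead=True)

print(allowed_actions(clear_road, rules))  # one legal action
print(allowed_actions(deer, rules))        # no legal action: an edge case
print(resolve(deer, rules))                # a human decided which rule yields
```

The ordering of the rules list is exactly the kind of human assumption described above: someone had to decide that staying in the lane matters less than avoiding the deer.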
The bigger challenge, however, is scale. The number of potential edge cases grows roughly with the square of the number of rules. If you have 100 rules in your model, and each individual rule can conflict with every other rule, you have on the order of 10,000 possible edge cases (4,950 distinct pairs, before counting conflicts among three or more rules). The more complex the driving scenario, the more rules required to drive safely, and the number of edge cases explodes. A complex situation might require 1,000 rules, which would produce on the order of 1,000,000 possible edge cases. Resolving these by hand is nearly impossible.
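The arithmetic is easy to check. The short sketch below counts distinct rule pairs (each pair counted once) next to the simple "rules squared" estimate; the function name is my own, and it assumes conflicts only ever involve two rules at a time, which if anything understates the problem.

```python
from math import comb

def pairwise_conflicts(num_rules: int) -> int:
    """Distinct pairs of rules that could conflict, each pair counted once."""
    return comb(num_rules, 2)

for n in (10, 100, 1000):
    # Columns: number of rules, distinct pairs, rules squared.
    print(n, pairwise_conflicts(n), n * n)
```

Either way of counting grows quadratically: ten times the rules means roughly one hundred times the potential conflicts.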
If this strikes you as an unnatural way to drive, it should. Robotics quickly becomes gummed up by an overload of information. There is no scalable mechanism to filter or prioritize information, so the programs are constantly evaluating decisions in the context of thousands of rules and even more exceptions. Where people excel, robots are paralyzed.
Driving is a Learning Problem
Driving is not a rules problem; it is a learning problem. Most of driving is determining what is important and what can be ignored. People can drive because we have become extremely good at filtering out noise, not because we have become especially good at memorizing lots of rules and exceptions.
The core challenge is that there is a lot of noise on the road. The total data input at a given moment on the road is massive, but only a small fraction of it is important in making your next driving decision. Consider the freeway — despite a constant stream of cars whizzing past you in both directions, moment by moment, almost all of your decisions are based on a few lane markers and the distance between you and the car in front of you.
But when a car swerves towards you from two lanes over, what was an irrelevant input moments ago becomes central to your split-second decision. Determining what data input is important at which moment is the heart of the task.
This sort of prioritization is what machine learning does best. Machine learning models take in lots and lots of information, discover its relevance, and optimize for a certain outcome. In driving, that means learning which signals are important and which signals can be ignored, without getting stuck sorting out every possible rule.
The Modern Learning Company
This idea represents a paradigm shift in developing driving logic for autonomous vehicles, one that can deliver a product that is an order of magnitude safer and better.
In robotics, you write a bunch of rules, put them in a car and drive around to see if they work. It’s essentially “guess and check”: your models are based on whatever assumptions an engineer might make about driving. When it breaks, you solve the edge cases, writing exceptions by hand. Then you repeat the process. This is limiting in two ways. First, your assumptions are made up, subject to human error or bias. Second, your assumptions are difficult to change: with every exception built on top of every rule, it becomes extremely expensive to adjust your initial assumptions, akin to starting over.
In contrast, a learning system is an iterative process designed to discover the ideal program to drive a car. Instead of starting by making assumptions about how to drive, you start by observing how people actually drive in the real world. This data — capturing what people see and what they do next — serves as ground truth for your models. You then use machine learning techniques to discover which features in the environment actually impact driving decisions, and how people best navigate the world safely. This process is highly iterative, testing a broad selection of potential inputs and measuring model performance against more real-world data. You are essentially developing a driving model in reverse, starting with the right answer, and then using math to discover how people get to that answer.
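As a toy illustration of working backwards from the right answer, the sketch below fits a tiny linear steering policy to synthetic "demonstrations" using plain gradient descent. All of it is assumed for illustration: the two features, the made-up driver who steers at -0.5 times the lane offset and ignores oncoming traffic, and the function names. It is not Ghost’s method or any production technique, just the general idea of learning from observed behavior.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_demonstrations(n):
    # Hypothetical logs: what the driver saw, and what they did next.
    demos = []
    for _ in range(n):
        lane_offset = rng.uniform(-1.0, 1.0)  # actually drives the decision
        oncoming = rng.uniform(-1.0, 1.0)     # in the scene, but ignored
        steering = -0.5 * lane_offset + rng.gauss(0.0, 0.01)
        demos.append(([lane_offset, oncoming], steering))
    return demos

def fit_linear_policy(demos, lr=0.1, epochs=500):
    # Full-batch gradient descent on mean squared steering error.
    weights = [0.0, 0.0]
    n = len(demos)
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for features, steering in demos:
            error = sum(w * x for w, x in zip(weights, features)) - steering
            for i, x in enumerate(features):
                grad[i] += 2.0 * error * x / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

demos = make_demonstrations(200)
weights = fit_linear_policy(demos)
print(weights)  # first weight lands near -0.5, second near 0
```

Nobody told the model that oncoming traffic was irrelevant here. It assigned that input a weight close to zero because the demonstrations never depended on it, which is the filtering behavior the learning approach is after.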
The obvious advantage here is flexibility. When engineers try to write driving rules in robotics, they are guessing. No one knows exactly what features or inputs in a scene influence our driving behavior. With learning systems, the guesswork is removed: you can discover the most important inputs and the correct behavior by observing lots of real-world driving. A system that can rapidly hypothesize, train and measure is no longer constrained by a list of rules; any array of signals (e.g. objects, classifiers, sign recognizers, road rules from HD maps) can be tested to find the optimal mix. This is probably the single greatest benefit of the new machine learning age: allowing computers to find patterns in the world that people have failed to fully describe with conventional, hand-written code. The flexibility to test a lot of variables leads to a superior product.
But Doesn’t Waymo Use Machine Learning?
An aside: there is a popular myth about machine learning in robotics-based autonomous vehicles. Almost all autonomous vehicle systems use machine learning for perception and classification of mobile objects, essentially determining the difference between a bicycle and a bunny on the street. But the learning system stops at a wall of conventional code when it comes to making the actual driving decision (also known as “the hard part”). In robotics, driving (or “planning”) is just a list of conventional rules like we have been writing in software for 50+ years.
Raised By the Streets
Ultimately, autonomous vehicles are judged by their performance on the street. The learning system again has a distinct advantage over robotics in its ability to deliver something we can actually use safely and reliably in the real world.
Learning systems start in the real world. The models are trained on huge sets of real-world data, and then tested against even bigger sets of real-world data held back for this exact purpose. By the time a model hits the road, its efficacy has already been measured against millions, or even billions, of miles of recorded driving, before you ever put it in a car.
Robotics cannot make that same claim. These machines can only test their software by driving around in the real world. It is extremely expensive and time-consuming to test just a thousand miles, let alone millions of miles in the real world. Plus, you’re constantly having to start testing from zero with every software change. Previous experience driving around has all been conducted with old software, which cannot prove the performance of your new software.
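Holding data back is a standard machine learning practice, and a minimal sketch makes the idea concrete. The example below uses a synthetic driving log, a one-parameter steering model, and an 80/20 split; all of these are assumptions chosen for illustration, not a description of any real evaluation pipeline.

```python
import random

def make_log(n, rng):
    # Hypothetical logged pairs: (lane_offset, steering the driver applied).
    log = []
    for _ in range(n):
        offset = rng.uniform(-1.0, 1.0)
        log.append((offset, -0.5 * offset + rng.gauss(0.0, 0.02)))
    return log

def fit_gain(pairs):
    # Closed-form least squares for the model: steering = gain * offset.
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)

def mean_abs_error(gain, pairs):
    return sum(abs(gain * x - y) for x, y in pairs) / len(pairs)

rng = random.Random(0)
log = make_log(1000, rng)
rng.shuffle(log)
train, held_out = log[:800], log[800:]  # the last 20% is never used for fitting

gain = fit_gain(train)
print(gain, mean_abs_error(gain, held_out))
```

Because the held-out portion played no part in fitting, its error is an estimate of how the model behaves on driving it has not seen. When the model changes, the same recorded data can be replayed against the new version without re-driving those miles.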
Machine learning is turning the software world upside down, dramatically outperforming decades of work in conventional programming and robotics. It will reinvigorate the possibility of real self-driving, changing our everyday lives in the next few years. I’m optimistic.