Ghost’s first product is an L4 highway pilot, capable of autonomously driving its passengers both in stop-and-go traffic and at cruising speed on the freeway. We are actively testing the Ghost Autonomy Engine on the freeway seven days a week, driving thousands of miles per month autonomously. Let’s take a look at a typical drive and give you some insight into how Ghost “sees”, “thinks”, and ultimately turns accurate perception into safe self-driving.
To give you a view inside Ghost, we created a visualization dashboard that we call “Ghost View” — a mash-up of data from the cameras, radar, neural networks, and drive program that comprise Ghost. Please take a look at this quick Ghost View video excerpt of an everyday Ghost highway test, and we’ll explain more about what you are seeing and how it all works below.
Before we talk tech, let’s explore a bit more detail on the scenario you are seeing, and how it is visualized in Ghost View.
This particular drive is on US-280 South in the Bay Area in the afternoon, and is just one example of the typical Ghost road testing that we perform day in and day out. While engaged, Ghost drives the car autonomously, with the test overseen by both a test driver and a safety co-driver in the car. Ghost uses front stereo cameras and HD radar for perception, and both the perception pipeline and driving program run on a small Driving Computer in the trunk. All Ghost vehicles are also equipped with side and rear stereo vision cameras, but this data is not yet used by the drive planner, which currently limits Ghost to single-lane driving. Side and rear sensing will be enabled soon, widening our perception and adding the ability to change lanes, as well as to more comfortably handle scenarios such as lane encroachment from the side. Sensed objects and lanes are displayed in Ghost’s HMI with a few current limitations: first, only objects in the ego lane ("ego" is AV speak for the Ghost-equipped car) and one lane on each side are visualized; second, the HMI only displays side vehicles once they are slightly in front of the ego vehicle; and third, objects are currently only visualized in the HMI out to a distance of ~70m. All of these limitations will be resolved in the production version of Ghost.
Introducing “Ghost View”
The three-panel “Ghost View” visualization that you see above could also use a bit of explanation. It combines a live camera feed from inside the car (giving you a view of the safety driver, the Ghost HMI, and the driving scene out the front windshield, as a driver would normally see it) with two additional views that show what Ghost is “seeing” and “thinking”.
The bottom-left view is a stereo disparity view, where the color of each pixel represents the disparity (used to calculate distance) between the right and left front cameras. This is one of the direct outputs of the KineticFlow neural network, heavily downsampled in this output video. If you look at the road, you will see a smooth gradient of color moving away from the vehicle, as each vertical pixel represents a farther point on the road. When a vehicle is detected, you’ll see a constant color that seems to rise up from the road, since the back of a vehicle, for example, sits at a single distance across its vertical height.
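The relationship between disparity and distance follows from simple stereo geometry. Here is a minimal sketch using a pinhole-camera model with hypothetical focal length and baseline values (not Ghost’s actual camera parameters):

```python
# Minimal sketch of stereo disparity -> distance under a pinhole model.
# The focal length and baseline below are hypothetical, for illustration only.
FOCAL_LENGTH_PX = 1400.0   # assumed focal length, in pixels
BASELINE_M = 0.30          # assumed spacing between left/right cameras, in meters

def disparity_to_distance(disparity_px: float) -> float:
    """Distance grows as disparity shrinks: Z = f * B / d."""
    if disparity_px <= 0:
        # No measurable disparity means the point is effectively at infinity.
        return float("inf")
    return FOCAL_LENGTH_PX * BASELINE_M / disparity_px

# A far point on the road produces a small disparity; a nearby bumper, a large one.
far = disparity_to_distance(6.0)    # 70.0 m
near = disparity_to_distance(42.0)  # 10.0 m
```

This is why the road appears as a smooth color gradient: each row up the image corresponds to a smaller disparity, hence a greater distance.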
Also at the bottom-left you’ll see some operational stats about the vehicle: the current steering, throttle, and brake inputs that Ghost (or the human test driver) is executing in the car at that moment, the speed of the vehicle, the autonomy state (Ghost driving or not), and whether the safety driver is currently overriding Ghost using the pedals or wheel.
At the bottom-right, you’ll see a view of the lanes and relevant objects that Ghost has detected and is using to drive in the scene. The ego lane is shaded green, with a line down the center indicating the detected center of the lane, and the +1/-1 lanes are shaded light gray. In similar views from other autonomy companies, you’ll often see square or cuboid “bounding boxes” placed around objects, indicating that objects have been visually identified using AI-based image recognition. Ghost’s KineticFlow AI works differently, sensing objects universally using physics cues based on surfaces found in the scene rather than classic image recognition. This view visualizes relevant detected objects with a target icon, with the distance to each object listed below the target. Finally, this same data is shown in a top-down view on the right side of the screen, with additional data showing the relative velocity between each object and the ego vehicle.
How Ghost Drives
OK, now that you understand what you’re seeing, let’s discuss some of the technologies behind how Ghost drives on the freeway. This blog will be woefully inadequate at explaining how the entire Ghost Drive Program works, but hopefully it will give you a sense of some of the techniques that Ghost uses for highway autonomy. At a high level, we’ll explore lane/scene detection, actor/obstacle detection, actor/obstacle distance/velocity measurement, ambient traffic speed detection, and drive path calculation.
How Ghost Detects Lanes, Understands Scenes
Although everything happens more or less in parallel, it’s helpful to think of the first step as understanding the scene itself. For a relatively simple single-lane-keeping scene like this, that basically means understanding the ego lane position, road curvature, and lane banking. Understanding the lane position is obviously critical for deciding where to drive; understanding curvature is equally important not only for driving, but also for deciding whether a detected actor in the distance is in your lane.
Ghost uses the KineticFlow visual neural network to discover objects/actors, road features, and lane markers on the road. It also uses a scene canonicalization neural network to discover lanes, drawing on several inputs including visible lane markers, the placement of detected vehicles, and observed vehicle driving paths. The scene canonicalization network also leverages a near-term history of prior lane geometries, including lane boundaries, to predict lane continuity where road markers are missing. This combination of signals helps Ghost understand lanes even when lane markers are obscured, or when challenging or confusing lane markers exist, say after construction. This is particularly important in stop-and-go traffic: with many vehicles in the scene, many lane markers are obscured or occluded, and the network must predict them to form a robust lane boundary.
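The idea of combining several lane cues can be sketched as a confidence-weighted average. This is a hypothetical simplification (the function name, cue list, and weighting scheme are illustrative assumptions, not Ghost’s actual network):

```python
def fuse_lane_center(estimates):
    """Fuse lane-center estimates from different cues.

    estimates: list of (lateral_offset_m, confidence) pairs, e.g. one each
    from visible markers, detected-vehicle placement, and prior geometry.
    Returns the confidence-weighted lane center, or None if no cue is usable.
    """
    total_weight = sum(w for _, w in estimates)
    if total_weight == 0:
        return None  # no usable cue this frame; fall back to prior geometry upstream
    return sum(x * w for x, w in estimates) / total_weight

# Markers partly occluded (low confidence); vehicle paths and history agree.
center = fuse_lane_center([(0.10, 0.2), (0.02, 0.5), (0.00, 0.3)])
```

When one cue degrades (occluded paint after construction, say), its weight drops and the other signals carry the estimate, which is the intuition behind combining markers, vehicle placement, and lane history.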
How Ghost Detects Actors and Obstacles
Ghost uses both vision and radar signals to detect objects. KineticFlow, Ghost’s visual neural network, detects objects universally, using physics-based cues instead of image-based object recognition. You can read a whole blog about this, or watch this short tech talk video. Ghost also receives object detections from radar, giving it two different modalities for detecting relevant objects. This is particularly important in stop-and-go traffic, as radar can have difficulty detecting stopped objects, a scenario where vision-based detection shines. Another advantage of Ghost’s perception stack in stop-and-go traffic is that, by not requiring object recognition, it can detect partial objects very quickly. Think of a car in a line of cars that decides to pull into your lane: Ghost doesn’t need to see the entire car to start taking action. As soon as the bumper pulls out and is visible, it is discovered as a new object that needs to be acted on.
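To make the “physics cues from surfaces” idea concrete, consider a single column of the disparity image described earlier: on open road, disparity shrinks steadily row by row, while a vertical surface (the back of a vehicle) produces a run of near-constant disparity. A toy detector along those lines might look like this (thresholds and structure are illustrative assumptions, not Ghost’s implementation):

```python
def find_vertical_surface(column_disparities, tolerance=0.5, min_run=5):
    """Scan one disparity column from the bottom of the image upward.

    Open road yields steadily decreasing disparity; a run of near-constant
    disparity suggests a vertical surface (e.g. a vehicle's rear) without
    any image recognition. Returns the run's start index, or None.
    """
    run_start, run_len = 0, 1
    for i in range(1, len(column_disparities)):
        if abs(column_disparities[i] - column_disparities[i - 1]) <= tolerance:
            run_len += 1
            if run_len >= min_run:
                return run_start  # found a surface; no need to know what it is
        else:
            run_start, run_len = i, 1  # gradient resumed; restart the run
    return None

# Open road only: disparity falls smoothly, so no surface is flagged.
road_only = find_vertical_surface([40, 38, 36, 34, 32, 30])
# A bumper edging into frame: constant-disparity run starting at index 3.
with_car = find_vertical_surface([40, 38, 36, 20, 20.1, 19.9, 20, 20.2])
```

Because the cue is purely geometric, even a sliver of bumper produces a detectable surface, which is the property the paragraph above relies on.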
How Ghost Measures Distance and Velocity
Just as Ghost uses both vision and radar to detect objects, it uses both technologies to redundantly measure object distance and velocity. As a primary sensor modality, radar provides the fastest and most precise measurement of relative velocity and is a good predictor of distance for faraway objects. KineticFlow’s stereo vision neural network provides highly accurate distance measurements, particularly in the medium and near field. KineticFlow can also measure object velocity by calculating motion paths over multiple frames of images, albeit a bit more slowly, as it requires multiple frames. Ghost merges these independent vision and radar measurements in the Drive Program, where they are weighed alongside the confidence level in each measurement.
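One standard way to merge two independent measurements by confidence is inverse-variance weighting; here is a minimal sketch under that assumption (Ghost’s actual merge logic in the Drive Program is surely more involved):

```python
def fuse_measurements(vision, radar):
    """Inverse-variance weighted fusion of two independent measurements.

    Each measurement is (value, std_dev), e.g. a distance in meters.
    The tighter (lower std_dev) measurement dominates the fused result.
    """
    (xv, sv), (xr, sr) = vision, radar
    wv, wr = 1.0 / sv**2, 1.0 / sr**2          # confidence = inverse variance
    fused = (wv * xv + wr * xr) / (wv + wr)    # weighted average
    fused_std = (1.0 / (wv + wr)) ** 0.5       # fused result is tighter than either
    return fused, fused_std

# Near field: vision is tighter (std 0.2 m vs radar's 0.8 m),
# so the fused distance leans toward the vision estimate.
dist, std = fuse_measurements(vision=(24.8, 0.2), radar=(25.6, 0.8))
```

The same formula applies to velocity, with the roles reversed: radar’s velocity measurement is typically the tighter one, so it dominates there.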
How Ghost Detects Ambient Speed
Another tricky feature of freeway driving is that the lanes of a multi-lane freeway may exhibit very different speed characteristics, and these can change quickly. Ghost uses radar and visual detection to understand the ambient speed and density of vehicles in the scene on a per-lane basis. For example, if traffic is stopped on both sides of the ego lane, and an actor in front of the ego car changes lanes to reveal a stretch of open road, it is probably neither safe nor comfortable to accelerate quickly; the more “normal” driving behavior is to move gently along with traffic and expect the ego lane to slow again. On the other hand, if the lane to the right is stopped, the lane to the left is cruising, and the ego lane becomes free, it is more acceptable to pick up speed in the ego lane. This is a complex topic that warrants a follow-on blog of its own, but suffice it to say that driving comfortably requires understanding traffic flow dynamics around the vehicle on all sides, not just in-lane.
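A per-lane ambient speed estimate can be sketched by grouping tracked vehicles by lane and taking a robust statistic. This is a hypothetical simplification (function name, lane indexing, and the choice of median are our assumptions):

```python
from statistics import median

def ambient_speed_by_lane(tracks):
    """Estimate ambient traffic speed per lane.

    tracks: list of (lane_index, speed_mps) for detected vehicles, where
    lane 0 is the ego lane and -1/+1 are the adjacent lanes. The median
    is robust to a single outlier, e.g. one car darting through a jam.
    """
    by_lane = {}
    for lane, speed in tracks:
        by_lane.setdefault(lane, []).append(speed)
    return {lane: median(speeds) for lane, speeds in by_lane.items()}

# Left lane (-1) cruising, ego lane (0) crawling, right lane (+1) stopped:
# exactly the asymmetric situation described above.
flow = ambient_speed_by_lane(
    [(-1, 28.0), (-1, 30.0), (0, 2.0), (0, 1.5), (1, 0.0), (1, 0.2)]
)
```

With a per-lane picture like this, the planner can distinguish “everything is jammed, don’t surge into a gap” from “my lane is genuinely opening up.”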
How Ghost Determines the Drive Path
After all the lane, scene, actor, and traffic flow perception, it’s finally time to drive. As a default motion, Ghost senses the vehicle in front and seeks to follow it at a speed and distance appropriate to the in-lane and ambient flow of traffic. This alone is a good deal of work, as accurate perception and a detailed understanding of the specific vehicle’s kinematics are required to drive smoothly. Once safe and smooth lane/vehicle following is achieved, a second goal is comfortable lane placement. This means giving space to larger vehicles, and deviating within the lane to add comfort and safety when a vehicle in a neighboring lane encroaches and the placement of vehicles in the scene allows it (note: lane deviation is a future Ghost feature that will come as side sensing is enabled). Finally, the drive plan, of course, gets more interesting under challenging traffic conditions, where vehicles cut in and perform more aggressive lane changes in front of the ego vehicle, a situation that happens frequently in stop-and-go traffic. This is where the fast and universal perception of Ghost’s vision and radar stack shines, giving Ghost accurate information as quickly as possible. It’s also where the real-time nature of Ghost’s drive planning, which re-calculates the drive plan 30 times per second, allows for quick reactions and evasive, yet comfortable, maneuvers.
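The default vehicle-following behavior can be illustrated as one tick of a time-gap controller running at the 30 Hz planning rate. The gains, time gap, and comfort limits below are illustrative assumptions, not Ghost’s planner:

```python
def follow_speed(ego_speed, lead_speed, gap_m, time_gap_s=1.8, dt=1 / 30):
    """One 30 Hz planning tick of a simple time-gap follower.

    Closes toward the desired gap while converging on the lead vehicle's
    speed, with acceleration clamped to comfortable limits.
    """
    desired_gap = max(ego_speed * time_gap_s, 5.0)  # never target closer than 5 m
    gap_error = gap_m - desired_gap                 # negative => too close
    speed_error = lead_speed - ego_speed            # negative => closing on lead
    accel = 0.5 * speed_error + 0.1 * gap_error     # hypothetical gains
    accel = max(-3.0, min(accel, 2.0))              # comfort limits, m/s^2
    return ego_speed + accel * dt

# Lead is slightly slower and the gap is tighter than desired,
# so the ego vehicle eases off a touch this tick.
new_speed = follow_speed(ego_speed=25.0, lead_speed=24.0, gap_m=35.0)
```

Re-running a computation like this 30 times per second is what lets the plan absorb a sudden cut-in: the next tick sees the new lead vehicle and its much smaller gap, and the clamped deceleration keeps the response firm but comfortable.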
Hopefully this visual peek inside the Ghost machine gives you a better sense of how Ghost drives. Ghost’s unique combination of redundant vision- and radar-based perception, robust lane sensing, and quick reaction time makes it well-suited to safe, comfortable highway autonomy, enabling an attention-free commute. Look for future posts where we’ll share how Ghost handles more challenging scenarios, as well as break down the details of every step in the driving pipeline.