Ghost Autonomy develops autonomous driving software for consumer cars. The company has developed a new autonomy stack designed for the mass market, with a new approach to perception and planning that can run on scalable, commodity hardware.
One of Ghost’s core innovations is a new perception system built around AI-based stereo vision, which uses a new application of neural networks and pairs of cameras to deliver an instantaneous, high-resolution depth field in every direction that eliminates the need for lidar.
Obstacle Detection – The Status Quo
One of the fundamental building blocks of driving is detecting and ranging obstacles on the road.
People do this with their eyes, using a combination of stereo and mono vision to see obstacles and roughly estimate their distance, relative velocity and direction of motion. While people are not especially precise at estimating distance, they are extremely good at responding to relative velocity and almost never miss obstacles when paying attention.
Driver assistance and automated driving systems use a combination of cameras, radar and sometimes lidar to deliver a reliable detection and ranging solution across a broad range of lighting and weather conditions.
- ADAS – Less capable driver assistance systems use a combination of mono cameras and radar for detection and ranging. Mono cameras use neural networks to recognize objects, and then use rough size approximation to estimate distance. These measurements are fused with radar returns to confirm obstacles in the depth field.
- Robotaxi – More capable robotaxis add lidar to this mix for increased resolution and accuracy, providing a much richer depth field visualized as a point cloud. Lidar is a specialized sensor that sends out pulses of light and measures the time for the reflected light to return to the receiver. While highly accurate, this sensor has not yet scaled to the commercial mainstream, with both technical and physical limitations that make it difficult to integrate in everyday cars.
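The two ranging principles described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the function names, the pinhole-camera model and the example numbers (a 2000-pixel focal length, a 1.5 m assumed vehicle height) are assumptions chosen for the example.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def mono_distance_m(focal_length_px, assumed_height_m, bbox_height_px):
    """Mono-camera range from size: a recognized object of assumed real-world
    height appears smaller in the image the farther away it is (pinhole model)."""
    if bbox_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return focal_length_px * assumed_height_m / bbox_height_px


def lidar_distance_m(round_trip_time_s):
    """Lidar time-of-flight range: the pulse travels out and back,
    so the one-way distance is half the total path."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0


# A vehicle assumed to be 1.5 m tall that spans 60 pixels in the image.
estimated = mono_distance_m(2000.0, 1.5, 60.0)

# The same 60-pixel detection if the vehicle is really 1.8 m tall:
# the size assumption, not the sensor, sets the error.
actual = mono_distance_m(2000.0, 1.8, 60.0)
```

The mono estimate is only as good as the recognized object class and its assumed size, which is why these systems confirm it against radar returns. The time-of-flight measurement needs no such assumption.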
The Challenges with Automotive Lidar – A Costly One-Trick Pony
While lidar has proven effective for measuring distance, it has not been proven necessary for advanced autonomous driving, and it has several well-understood drawbacks:
- Insufficient Resolution, Spectrum – Lidar is insufficient as a primary sensor for driving. It cannot see color, so it cannot see lane markings, signal lights or emergency lights, and it cannot read signs. Cars still need vision to understand a scene, so lidar returns must still be fused with vision (and almost always with radar as well).
- Poor Performance in Weather – Lidar performance degrades in heavy rain, snow or fog, much as camera performance does. Cameras and lidar are not sufficient on their own; an additional radar sensor is still necessary to navigate bad weather conditions.
- High Cost – Lidar is not yet cost-competitive. It is still 10x more expensive than the camera and radar solutions that have been in the consumer market at scale for 50+ years.
- High Power Consumption – Lidar uses significantly more power than cameras, adversely impacting fuel economy or EV range. Where a camera sensor draws <2W, a lidar unit can draw anywhere from 20-50W.
- Unidirectional – While some robotaxis use 360° spinning lidar, lidar deployments in consumer vehicles use unidirectional solid-state lidar. The majority of initial consumer lidar deployments are forward-only, as 4 lidars would be required to achieve 360° coverage around the car. Moreover, lidars cannot be hidden behind panels, so they do not fit neatly into existing vehicle envelopes.
A reliable, dense depth field is useful for autonomous driving. If this were available via a camera or radar sensor, lidar would no longer be required, avoiding the complexity and expense of a superfluous sensor.
A New Stereo Vision Solution for Ranging
Ghost has developed a new vision-based perception system for obstacle detection, ranging, and velocity measurement with stereo cameras.
Stereo cameras are based on the principles of binocular vision, backed by more than 100 years of technological development and millions of years of evolutionary biology. Two camera sensors capture the same scene from slightly different angles, and the system triangulates distance from the disparity, that is, the shift in position of the same object between the two images.
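For a rectified camera pair, the triangulation reduces to the standard relation depth = focal length × baseline / disparity. The sketch below shows that textbook relation only. The function names and the example values (a 2000-pixel focal length and a 0.3 m baseline) are assumptions for illustration, not Ghost's camera parameters.

```python
def stereo_depth_m(focal_length_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair.

    disparity_px is the horizontal shift of the same point between the
    left and right images; nearer points shift more.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


def depth_error_m(focal_length_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth uncertainty caused by a disparity matching error.

    Derived from Z = f * B / d, so |dZ| is approximately Z**2 / (f * B) * |dd|:
    the error grows with the square of the distance.
    """
    return depth_m ** 2 * disparity_error_px / (focal_length_px * baseline_m)


# A point whose image shifts 12 pixels between the two cameras.
depth = stereo_depth_m(2000.0, 0.3, 12.0)

# Effect of a quarter-pixel matching error at that depth.
error = depth_error_m(2000.0, 0.3, depth, 0.25)
```

Because the error scales with the square of the distance, long-range accuracy depends on sub-pixel disparity precision and sensor resolution, which is one reason high-resolution cameras matter for stereo ranging.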
The math required to calculate disparity (and thus distance) is computationally expensive, which has been a significant technical barrier to meeting the high-resolution, real-time requirements of driving. Ghost solved this problem by developing neural networks that approximate stereo math with high speed and precision but very little compute, enabling instant depth measurements for every single pixel in a scene.
The result is a vision-based system that can deliver a highly reliable, high resolution depth field in every direction to support attention-free autonomous driving, without requiring any additional purpose-built sensor:
- High Resolution – Ghost’s stereo-vision works with 8MP cameras, delivering per-pixel depth for 8 million pixels in every scene, in 12-bit color.
- Instantaneous Depth – Ghost’s stereo-vision returns per-pixel depth in <10ms, enabling a high-frequency planning and control system that recalculates a new drive plan 30x / second.
- Low Cost – Ghost’s stereo-vision system is based on commodity 8MP camera sensors, making it possible to enable 360° stereo vision with just 4 camera pairs. These same cameras can simultaneously support other computer vision algorithms, e.g. monocular neural networks for road marker detection.
- Low Power Consumption – Ghost’s stereo-vision system uses cameras that draw <2W, with the entire 360° implementation using less than 12W.
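The figures quoted in this article can be combined into a rough budget. The arithmetic below uses only numbers stated here (4 camera pairs, a 12W ceiling, <10ms depth latency, 30 plans per second, and 4 lidars at 20-50W each). The variable names are illustrative, and the calculation assumes the 12W figure covers all eight cameras.

```python
CAMERA_PAIRS = 4
CAMERAS = CAMERA_PAIRS * 2                # two sensors per stereo pair

SYSTEM_POWER_CEILING_W = 12.0             # "less than 12W" for 360° coverage
PLAN_RATE_HZ = 30                         # drive plans per second
DEPTH_LATENCY_CEILING_MS = 10.0           # "<10ms" per-pixel depth

LIDAR_UNITS_FOR_360 = 4                   # lidars needed for 360° coverage
LIDAR_POWER_RANGE_W = (20.0, 50.0)        # quoted per-unit draw

# Average draw per camera implied by the system ceiling (assumption: the
# ceiling covers the cameras themselves).
avg_camera_power_ceiling_w = SYSTEM_POWER_CEILING_W / CAMERAS

# Time available per planning cycle, and what remains after depth is computed.
plan_period_ms = 1000.0 / PLAN_RATE_HZ
remaining_ms = plan_period_ms - DEPTH_LATENCY_CEILING_MS

# Equivalent 360° lidar coverage, low and high end of the quoted range.
lidar_360_power_w = tuple(LIDAR_UNITS_FOR_360 * p for p in LIDAR_POWER_RANGE_W)
```

Under these assumptions each camera averages under 1.5W, depth consumes less than a third of each 33ms planning cycle, and equivalent lidar coverage would draw 80-200W.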
No Recognition Required – A Safety Breakthrough for Vision
Ghost’s stereo-vision implementation achieves per-pixel depth in a scene with a camera, eliminating the need for object recognition required for mono camera depth estimates.
This implementation dramatically increases the safety and reliability of vision-based systems by eliminating the long-tail errors associated with object recognition. Some of the most publicized accidents in self-driving have been due to recognition errors, where vision-based systems failed to detect a fire truck or overturned semi-truck in their way because they did not recognize the object.
A vision-based system no longer needs to recognize an obstacle in order to see it and ultimately avoid it. Instead, Ghost’s stereo-vision simply detects clusters of pixels, no matter what they are, and flags them as obstacles in a scene, including relevant distance and velocity measurements.
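The idea of flagging pixel clusters without classifying them can be illustrated with a toy depth map. This is a simplified sketch, not Ghost's algorithm: it assumes the background and road surface have already been set to a large "far" value, groups neighbouring near pixels with a plain flood fill, and estimates closing speed from the change in range between two frames. All names and thresholds are hypothetical.

```python
from collections import deque


def find_obstacles(depth_m, max_range_m=60.0, min_pixels=3):
    """Group 4-connected pixels nearer than max_range_m into clusters.

    Returns one dict per cluster with its pixel count and nearest range.
    No object class is ever assigned.
    """
    rows, cols = len(depth_m), len(depth_m[0])
    seen = [[False] * cols for _ in range(rows)]
    obstacles = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or depth_m[r][c] >= max_range_m:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill over near pixels
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and depth_m[ny][nx] < max_range_m):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_pixels:  # drop isolated noisy pixels
                nearest = min(depth_m[y][x] for y, x in pixels)
                obstacles.append({"pixels": len(pixels), "nearest_m": nearest})
    return obstacles


def closing_speed_m_s(previous_range_m, current_range_m, frame_interval_s):
    """Positive when the obstacle is getting closer between two frames."""
    return (previous_range_m - current_range_m) / frame_interval_s
```

In this sketch an overturned truck and a cardboard box are treated identically: each is a group of pixels at a measured range, so neither depends on a classifier having seen that object before.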
With stereo-vision, Ghost delivers universal obstacle detection that is robust to the long tail of obstacles and events, dramatically improving collision avoidance and related safety outcomes on the road.
Bringing Autonomy to the Volume Market
From the beginning, Ghost has designed its autonomous driving system for scale, targeting the volume market of cars that people drive every day to work.
While many other companies have solved challenging problems with expensive hardware, Ghost has built new software-based solutions that can scale to a much broader range of cars at significantly lower price points.
Ghost’s novel approach to stereo-vision embodies this belief, using inexpensive mobile cameras and chips and new applications of neural networks to eliminate the need for lidar.
This breakthrough approach to perception is just one of many new technologies Ghost is bringing to automakers to chart a new path forward for consumer autonomy.