Exclusive: Waymo engineering exec discusses self-driving AI models that will power the cars into new cities
When I posted on Instagram in May about my first ride in a self-driving, AI-powered Waymo car in San Francisco—a video of an empty driver’s seat with the steering wheel making a smooth left turn on a busy street—the responses ranged from “Yikes” and “That is CRAZY” to “No no no.”
I’ll admit, I was a bit freaked out myself as the Waymo, a distinctive white all-electric Jaguar outfitted with rotating sensors on the roof, ferried me up and down one of San Francisco’s steepest streets. But I was pleasantly surprised (and relieved) to watch the autonomous driver pause fully yet gently at every stop sign along the route. The Waymo was a defensive driver, but not too slow; it steered carefully, but ventured into traffic with conviction. I finally exhaled and relaxed.
Waymo’s polite chauffeur seems almost like magic. The ride left me wondering: How exactly did my Waymo manage to do this without crashing? Recently, Waymo gave me a chance to find out as it prepares to expand to Atlanta and Austin, adding to its fleets in the Bay Area, Los Angeles and Phoenix.
Waymo, a subsidiary of Alphabet that began as the Google Self-Driving Project in 2009, only started offering rides to the public without human safety drivers within the past few years. In August it announced that it’s serving over 100,000 paid rides per week in the handful of U.S. cities it operates in. But as Waymo expands to new cities, it needs to convince tens of thousands of new riders to get over the unsettling feeling of riding in a driverless taxi.
While Waymo has in the past been somewhat circumspect about exactly how its AI-powered self-driving tech works, the company now believes that offering a closer look under the proverbial hood is important for autonomous cars to gain wider acceptance. Srikanth Thirumalai, Waymo’s head of engineering for onboard technology, told Fortune that prioritizing safety in its messaging, rather than focusing on AI, has been crucial for building trust with potential riders. (Naturally, all self-driving vehicle companies must also navigate regulatory scrutiny.)
“We don’t want to take the focus off of what we’re actually trying to do here,” Thirumalai told Fortune in his first interview since joining Waymo a year ago after 18 years at Amazon. “We have to lead with ‘hey, we are developing this technology responsibly.’”
But helping more people understand how its AI-powered system works, he explained, is the next stage in the company’s 15-year effort to build “the world’s most trusted driver.”
“Sharing more information about our tech and its safety is essential for building trust with riders and communities in which we operate,” he said.
Beyond the hype of generative AI
While Waymo has pulled ahead of the self-driving competition for the time being, autonomous cars are still very much a work in progress. Rivals, from GM, Amazon, and Tesla to software developers like Wayve, are pouring billions of dollars into developing their own systems. And regulators are keeping a close eye on the robo-cars currently roaming the roads in designated areas.
GM’s Cruise had its permit yanked by the California DMV last year after an accident in which one of its San Francisco robotaxis struck a pedestrian who had been thrown into its path by another car, then dragged the injured woman 20 feet. The company grounded its fleet across the U.S., and has only recently begun testing again in some cities with safety drivers behind the wheel.
Waymo has so far avoided serious incidents, but it has made its share of headlines: In May, an unoccupied Waymo taxi hit a telephone pole in Phoenix, Arizona, leading Waymo to issue a voluntary recall and update the software of its entire fleet of 672 autonomous vehicles. In August, there were reports of Waymos honking at each other in a San Francisco parking lot, disturbing neighbors. (Waymo said it was an “unintended consequence” of a safety feature meant to avoid low-speed collisions when a car is backing up.) And last week, a Waymo came to a halt next to a San Francisco bus right outside a Y Combinator event, and a video went viral of several tech CEOs attempting to move it. (A Waymo representative said that “the bus’s rear door made contact with the side of our vehicle and was unable to close. We dispatched our roadside assistance team to retrieve the vehicle, and before they arrived, bystanders rocked our vehicle free of the door, so the bus could proceed.”)
Thirumalai, who focused on AI-powered search and shopping at Amazon before joining Waymo, emphasized his excitement about the challenge of working to power products that function safely and reliably in real-world situations. In fact, it is one of the reasons he joined Waymo. At the same time, he added, it is a humbling way to get beyond the current hype around generative AI.
Self-driving cars, he explained, present an extreme “long-tail” learning problem, where events that are individually rare and unexpected are also, collectively, numerous and a top priority to address. These vehicles require AI that “generalizes really well,” he added, so it can handle relatively common and predictable situations, like stopping at a red light or yielding to pedestrians, as well as surprising or unusual scenarios such as a person in a wheelchair crossing the road at night, a tree toppling into the road, or even a herd of circus animals escaped from the back of a truck (OK, I made that last one up, but that’s the point: the AI needs to be ready for anything).
Waymo’s dual AI models — on-board and in the cloud
To tackle both predictable and “long-tail” driving situations, Waymo’s current technology stack in its cars — as well as a next-generation version still being tested and not yet available to riders — begins with dozens of on-board sensors that allow the car to visualize its environment and provide comprehensive data to help its AI system make real-time decisions.
These sensors include radar, high-definition and other video cameras, and external audio receivers, as well as roof-mounted LiDAR (Light Detection and Ranging) sensors that generate real-time, 360-degree, 3D views and provide depth perception. The array of sensors gives the Waymo Driver system overlapping fields of view so that it can simultaneously observe objects, obstacles, or terrain features up to 300 meters away, from different perspectives (the next-generation system will have a range of 500 meters on a clear day, Waymo says).
The sensors gather data from various scenarios, collected during every trip a Waymo takes. The company also uses synthetic data to train the Waymo Driver with simulations of a broader diversity of situations, such as weather conditions, than it might encounter on the roads of Phoenix or San Francisco.
Waymo has developed a large-scale AI model called the Waymo Foundation Model that supports the vehicle’s ability to perceive its surroundings, predict the behavior of others on the road, simulate scenarios and make driving decisions. This massive model functions similarly to large language models (LLMs) like ChatGPT, which are trained on vast datasets to learn patterns and make predictions. Just as companies like OpenAI and Google have built newer multimodal models to combine different types of data (such as text as well as images, audio or video), Waymo’s AI integrates sensor data from multiple sources to understand its environment.
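Waymo has not published the architecture of this model, but the general idea of multimodal sensor fusion, encoding each sensor’s reading into a shared feature space and combining the results for downstream decision-making, can be sketched in a few lines. Everything below (the encoders, dimensions, and readings) is a hypothetical stand-in, not Waymo’s actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-sensor "encoders": project each modality's raw reading into a
# shared 8-dimensional feature space (stand-ins for learned networks).
def encode(reading, weights):
    return np.tanh(weights @ reading)

lidar_points = rng.normal(size=12)   # e.g. flattened point-cloud statistics
camera_pixels = rng.normal(size=16)  # e.g. pooled image features
radar_returns = rng.normal(size=6)   # e.g. range/velocity bins

w_lidar = rng.normal(size=(8, 12))
w_camera = rng.normal(size=(8, 16))
w_radar = rng.normal(size=(8, 6))

# Fuse by concatenating the per-modality embeddings into one vector that
# downstream perception, prediction, and planning heads would consume.
fused = np.concatenate([
    encode(lidar_points, w_lidar),
    encode(camera_pixels, w_camera),
    encode(radar_returns, w_radar),
])
print(fused.shape)  # (24,)
```

The payoff of fusing modalities this way is redundancy: if one sensor is degraded (say, cameras at night), the shared representation still carries signal from the others.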
The Waymo Foundation Model is a single, massive model, but when a rider gets into a Waymo, the car works off a smaller, onboard model that is “distilled” from the much larger one, because it needs to be compact enough to run within the car’s power and compute constraints. The big model acts as a “teacher” that imparts its knowledge to smaller “student” models, a process widely used in the field of generative AI. The small models are optimized for speed and efficiency and run in real time on each vehicle, while still retaining the critical decision-making abilities needed to drive the car.
As a result, perception and behavior tasks, including perceiving objects, predicting the actions of other road users and planning the car’s next steps, happen on-board the car in real time. The much larger model can also simulate realistic driving environments to test and validate its decisions virtually before deploying to the Waymo vehicles. The on-board model also means that Waymos are not reliant on a constant wireless internet connection to operate — if the connection temporarily drops, the Waymo doesn’t freeze in its tracks.
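Waymo has not disclosed how its distillation works, but the teacher/student idea itself is standard: a small model is trained to imitate a large one’s outputs rather than raw labels. A toy sketch, with an arbitrary random-feature network standing in for the “teacher” and a single linear layer as the “student” (both entirely hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "teacher": a large, fixed network, a stand-in for the
# big cloud-side foundation model.
W1 = rng.normal(size=(64, 4))
W2 = rng.normal(size=(1, 64))
def teacher(x):
    return W2 @ np.tanh(W1 @ x)

# "Student": a single small linear layer, cheap enough to run in real time.
w = np.zeros((1, 4))

# Distillation loop: the student learns to match the teacher's
# predictions on sampled inputs; no ground-truth labels are needed.
lr = 0.05
for _ in range(5000):
    x = rng.normal(size=4)
    err = (w @ x) - teacher(x)  # gap between student and teacher outputs
    w -= lr * err * x           # gradient step on the squared gap
```

The student can only capture what its limited capacity allows, which is the engineering tension the article describes: the onboard model must stay small and fast while keeping the decision-making that matters.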
Ultimately, Thirumalai explained, Waymo’s AI system is able to choose the trajectory it believes is the best given the situation. (Waymo would not share specifics about the models, such as the number of parameters, citing confidentiality.)
“I’ve seen the future and would be stupid not to be a part of it”
Waymo’s AI system is not the only approach companies are using to tackle self-driving. For example, Wayve, a UK-based startup backed by Microsoft and Meta’s chief scientist Yann LeCun, does not use LiDAR (though it uses high-definition cameras); instead it relies on the kinds of cameras and ultrasound sensors that already come as standard in many vehicles, and is heavily focused on developing a single generative AI “world model” that interprets visual data and makes driving decisions as one integrated system. Tesla’s Autopilot system, which is not yet capable of unsupervised self-driving, also eschews LiDAR and relies on sensors and a suite of eight cameras that provide a 360-degree view around the vehicle. As the automaker pushes toward unsupervised self-driving, it has begun testing a new AI system based on neural networks.
Wayve, however, does not have hundreds of self-driving robotaxis on the road like Waymo does; instead, it is developing its software to deploy in vehicles built by major automotive manufacturers. And Tesla’s robotaxi reportedly won’t be on the road for at least another couple of years. Waymo’s approach, on the other hand, “works backwards from the problem we’re trying to solve,” Thirumalai said, “which is, how do you actually get these cars in the real world?”
For Thirumalai, the opportunity to help Waymo use AI to reach its safety goals was too good to pass up. “I was happy at Amazon and wasn’t really looking for a change,” he said. “Waymo came along and I was just floored by the team, their mission, what they’ve accomplished so far—it was clear to me that with this tailwind of AI, they were really going to grow and be a huge force of change for the world.”
While I may have had anxious moments going up that steep San Francisco street, Thirumalai had no such concerns. His own first-ever Waymo ride, taken during his interview with Dmitri Dolgov, Waymo’s co-CEO, “blew my mind,” he said.
The Waymo, he said, navigated morning traffic in downtown San Francisco and made it up Telegraph Hill all the way to Coit Tower, overlooking the city and the bay. To be clear, there is no army of human wizards behind the Waymo curtain helping the car go: “That would be extremely difficult for us to scale, with the millions of miles we drive every month,” he said, though he emphasized that human remote operators are available to step in if a Waymo gets stuck and needs additional context.
As Thirumalai’s Waymo safely navigated narrow streets, anticipated pedestrians and even dealt with double-parked cars, he realized he was experiencing a very special ride. “I’m like, these people have got us to a point where this is driving as well as, if not better than, a human, and they’re doing all this just as AI is taking off,” he said.
When Thirumalai returned from his Waymo ride, he told his wife about his experience. “I said, my God, I’ve seen the future,” he recalled. “And I would be stupid not to be a part of it.”