WarpStream is building a cheaper, cloud-native data streaming service
When the open source streaming service Apache Kafka was created at LinkedIn in 2011, it was a different world. Most companies were still on-prem, and the notion of cloud computing was just beginning to emerge. WarpStream, an early-stage startup, sees the value of streaming in a cloud-native context and has built a new solution from the ground up, based on the Apache Kafka protocol but designed for the cloud age.
Today the company announced a $20 million investment.
WarpStream CEO Richard Artoul says that he and his co-founder, CTO Ryan Worl, found in their previous jobs that moving data into Kafka was complex and expensive, and they wanted to change that. “If you were building something today that looked like Apache Kafka like we did with WarpStream, you would take a really different approach than was taken back in 2011 when Kafka was first designed, and so that’s why we think now’s a really good time for us to build something new that can actually meaningfully drive down costs and operational burden for people,” Artoul told TechCrunch.
They do that by taking advantage of today’s cloud environment to separate compute from storage, using an object storage service like Amazon S3. With this approach, they have been able to eliminate inter-zone networking costs, which often represent 80% or more of the total cost of running large-scale Kafka workloads, according to the company.
“When you interact and store data in cloud object storage you get to sidestep all these networking fees that plague these big data systems when they get lifted, shifted into the cloud,” he said. “And a lot of the really hard problems around data durability and replication that Kafka had to solve on its own by copying the data three times, replicating it and making sure that data was never lost, we’re able to offload those problems to the object storage layer itself, and that ends up making the system a lot easier and cheaper to operate.”
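To make the networking claim concrete, here is a rough back-of-the-envelope sketch in Python. It is not WarpStream's pricing calculator, and every name and number in it is a hypothetical assumption: clients are spread evenly across zones, each Kafka replica lives in a distinct zone, consumers read from the partition leader, and `price_per_gb` is a placeholder rather than any provider's actual rate.

```python
def cross_zone_gb_per_gb_produced(zones: int, replication_factor: int,
                                  consumer_groups: int) -> float:
    """Estimate GB crossing zone boundaries for each GB produced.

    Simplified model (all assumptions, not measured data):
      - producers and consumers are spread uniformly across zones
      - each replica of a partition sits in a different zone
      - consumers always fetch from the partition leader
    """
    if zones < 1 or replication_factor < 1 or consumer_groups < 0:
        raise ValueError("zones and replication_factor must be >= 1, "
                         "consumer_groups must be >= 0")
    if zones == 1:
        return 0.0  # a single zone has no cross-zone traffic

    # Chance that a client lands in a different zone than the leader.
    remote_fraction = (zones - 1) / zones

    produce = remote_fraction                       # producer -> leader
    replicate = min(replication_factor, zones) - 1  # leader -> followers
    consume = consumer_groups * remote_fraction     # leader -> consumers
    return produce + replicate + consume


def monthly_cross_zone_cost(gb_per_month: float, price_per_gb: float,
                            zones: int = 3, replication_factor: int = 3,
                            consumer_groups: int = 1) -> float:
    """Monthly inter-zone transfer bill under the model above."""
    multiplier = cross_zone_gb_per_gb_produced(
        zones, replication_factor, consumer_groups)
    return gb_per_month * price_per_gb * multiplier


if __name__ == "__main__":
    # Hypothetical workload: 100 TB produced per month, placeholder price.
    cost = monthly_cross_zone_cost(gb_per_month=100_000, price_per_gb=0.02)
    print(f"Estimated inter-zone transfer cost: ${cost:,.2f} per month")
```

In this simplified model, the replication term is the part a design like the one Artoul describes would hand off to the object storage layer, while the produce and consume terms depend on whether clients can stay within their own zone.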
Artoul and Worl were working together at Datadog when they helped develop a storage system called Husky. Today, if a Datadog customer is searching through their application logs, they’re actually using Husky. Datadog was also a big Kafka user. “Based on our experience building the kind of storage system on top of object storage we had built at Datadog, we felt like streaming systems should work the same way. And so last year we left Datadog to start working on it,” he said.
They are taking two approaches: one where customers can essentially bring their own cloud (BYOC) and install WarpStream there, and one where the company offers a fully managed serverless option. The BYOC version is available starting today. The company has also included a calculator on its pricing page so customers can estimate how much it will cost to run WarpStream.
The founders brought along some of the folks who helped build Husky to help build the new system, and today they have nine employees. The good news is that they are hiring and hope to double headcount by the end of the year.
The $20 million investment was led by Greylock and Amplify Partners, with some data industry luminaries also chipping in as angel investors.