Cursor is an AI-powered integrated development environment (IDE). Its large language models (LLMs) provide functionality like autocomplete and chat. Cursor can be used by people with little programming experience – prompts can generate code and feature-rich apps – as well as by experienced software engineers, for whom it acts as an AI teammate that makes debugging, writing code, and other tasks more efficient.
The company has made headlines multiple times for its rapid growth – growing to $100 million in annual recurring revenue (ARR) in just a year and surpassing $500 million in ARR only 30 months after launching. Cursor’s Kafka-compatible data streaming infrastructure is powered by WarpStream.
Alex Haugland, an engineer at Anysphere (developers of Cursor), was interviewed by Confluent’s Joseph Morais for an episode of the Life Is But a Stream podcast to talk about why and how Cursor leverages WarpStream. Below, you can find the full podcast and Alex’s responses to questions about WarpStream.
Cursor investigated using open-source Kafka (OSK) but ultimately chose WarpStream for several of the reasons that Alex highlights below.
If you look at WarpStream, you get to own your data, it sits in your S3 buckets, but you don't have to worry about managing a control plane, which is like by far the biggest pain in the ass of managing a system like that.
And you don't have to worry about scaling or storage or compute 'cause it's just like you get some amount of WarpStream nodes. You don't have to worry about routing between them. You just kind of have them. You have some amount of storage, which is in S3, which you never need to worry about scaling because it's S3.
Who cares? You don't have to worry about the relationship between the two of them. It's just like data go in, data go out. It's exactly the kind of abstraction that you want out of Kafka. You pay somebody a small amount of money, you get to keep your data. You don't have to think about it.
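Because WarpStream is Kafka-compatible, the "data go in, data go out" abstraction Alex describes looks like ordinary Kafka client code. The sketch below uses confluent-kafka (Python); the bootstrap address, topic name, and payload are placeholders for illustration, not Cursor's actual setup.

```python
# Minimal sketch: producing to and consuming from a WarpStream cluster with a
# standard Kafka client. The endpoint and topic are hypothetical placeholders.
from confluent_kafka import Producer, Consumer

BOOTSTRAP = "warpstream-agent.internal:9092"  # hypothetical agent endpoint

# Data goes in: a plain Kafka producer, no WarpStream-specific API.
producer = Producer({"bootstrap.servers": BOOTSTRAP})
producer.produce("events", key=b"user-123", value=b'{"action": "completion_accepted"}')
producer.flush()  # block until the write is acknowledged

# Data goes out: a plain Kafka consumer.
consumer = Consumer({
    "bootstrap.servers": BOOTSTRAP,
    "group.id": "example-consumer",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])
msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    print(msg.key(), msg.value())
consumer.close()
```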
A common question we get about WarpStream: since it trades a small amount of latency for cost savings and operational simplicity, does that bit of latency actually matter, and what does accepting it unlock?
Latency on WarpStream is pretty good. Most people don't really need 100-millisecond, 10-millisecond Kafka latency. It's just not important. At least for us, like we could use 10-, 20-, 50-second Kafka latency. We would never care; like we're mostly using it for slower stuff.
And if you're willing to accept just a little bit of latency, it makes so many things so much easier. You no longer have to worry about provisioning compute and storage, like, together.
You don't have to worry about scaling your stuff. You just kind of get this magical, like, ‘WarpStream, put stuff in S3,’ and I have some number of WarpStream nodes and I just don't ever think about it. And in exchange latency is like 500 milliseconds. Oh my God. Like that doesn't matter.
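To put that latency trade-off in concrete terms: when a few hundred milliseconds is acceptable, a standard Kafka producer can be tuned for batching and throughput instead of speed. The settings below are illustrative confluent-kafka/librdkafka properties with made-up values, not numbers from Cursor's deployment.

```python
# Illustrative only: if sub-second latency is fine, favor large batches and
# compression over fast acknowledgment. Values here are examples.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "warpstream-agent.internal:9092",  # hypothetical endpoint
    "linger.ms": 500,             # wait up to 500 ms to fill larger batches
    "batch.num.messages": 10000,  # allow big batches per partition
    "compression.type": "zstd",   # compress batches to cut network cost
    "acks": "all",                # durability still matters even if latency doesn't
})

for i in range(100_000):
    producer.produce("events", value=f"record-{i}".encode())
    producer.poll(0)  # serve delivery callbacks without blocking

producer.flush()
```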
But the thing that we actually value a lot more about WarpStream isn't so much like the way that it scales or any of the performance stuff or even the cost, particularly. It's the fact that the way that it handles the data means that at the end of the day, it's in our cloud. So all of the data is stored in S3 in a place that we control.
WarpStream's Bring Your Own Cloud (BYOC) deployment is zero-access and secure by default because the control plane and data plane are split.
WarpStream lets us do that [split] in a way where the data sits entirely inside stuff that we control and have access to. And so we don't have to worry about like, ‘Oh man, like if our data provider gets hacked, what are we gonna do? All the people's stuff is out there.' It's a huge, huge problem.
It's a betrayal of our users' trust. It's a serious thing. And being able to just say all this stuff is locked down in S3, we have control [of] it. There's just nothing going on there – [that] is really, really, really important to us.
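One way to see what "locked down in S3" means in practice: because topic data lives in a bucket your own account owns, you can audit it with your own cloud credentials and IAM policies. The bucket name below is hypothetical.

```python
# Minimal sketch, assuming a hypothetical bucket name: the data WarpStream
# stores sits in S3 storage you own, so your own credentials can inspect it.
import boto3

s3 = boto3.client("s3")

BUCKET = "my-company-warpstream-data"  # hypothetical bucket the agents write to

# Confirm who owns the bucket and list a few of the stored objects.
print(s3.get_bucket_acl(Bucket=BUCKET)["Owner"])
resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"], obj["LastModified"])
```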
WarpStream auto-scales, so users never have to worry about over- or under-provisioning.
In the case of the WarpStream side of things, right, you just don't think about it. It just scales up and you don't think about it. I think we've spent zero hours thinking about scaling WarpStream in the last few months. It's just like it's a solved problem. The servers in front of WarpStream that ingest the data and then queue it – they just kind of hang out.
People don't usually work on it [WarpStream]. It just kind of works. So they're really kind of just ignorant to that substrate. Yeah. I'm trying not to be hyperbolic, but like people do not, I think, interact with the WarpStream-related stuff on a regular basis.