Engineering

WarpStream Blog

Tiered Storage Won’t Fix Kafka

Apr 28, 2024
Richie
Tiered storage is a hot topic in the world of data streaming systems, and for good reason. Cloud disks are (really) expensive, object storage is cheap, and in most cases, live consumers are just reading the most recently written data.

Cloud Disks are (Really!) Expensive

Apr 20, 2024
Richard Artoul
Cloud disks are expensive. Really expensive. Most engineers intuitively understand this, but the magnitudes are worth considering.

The Original Sin of Cloud Infrastructure

Mar 14, 2024
Richard Artoul
The troubled past of open source cloud infrastructure, and why we raised $20M to try and fix it.

Deterministic Simulation Testing for Our Entire SaaS

Mar 12, 2024
Richard Artoul
How we leverage Antithesis to deterministically simulate our entire SaaS platform and verify its correctness, all the way from signup to running entire Kafka workloads.

Kafka as a KV Store: deduplicating millions of keys with just 128 MiB of RAM

Mar 4, 2024
Manu Cupcic
A huge part of building a drop-in replacement for Apache Kafka® was implementing support for compacted topics. The primary difference between a “regular” topic in Kafka and a “compacted” topic is that Kafka will asynchronously delete records from compacted topics that are not the latest record for a specific key within a given partition.

Anatomy of a serverless usage based billing system

Feb 8, 2024
Richard Artoul
Serverless products and usage-based billing models go hand in hand, almost by definition. A product that is truly serverless effectively has to have usage-based pricing; otherwise, it’s not really serverless!

S3 Express is All You Need

Nov 28, 2023
Richard Artoul
The future of modern data infrastructure is object storage.

Unlocking Idempotency with Retroactive Tombstones

Nov 18, 2023
Richard Artoul
How we separated data from metadata to build support for idempotent producers in our Apache Kafka protocol layer.

Minimizing S3 API Costs with Distributed mmap

Oct 9, 2023
Richard Artoul
We first introduced WarpStream in our blog post "Kafka is Dead, Long Live Kafka". To summarize: WarpStream is a Kafka protocol-compatible data streaming system built directly on top of object storage.

Hacking the Kafka PRoTocOL

Sep 18, 2023
Richard Artoul
How we built stateless load balancing into a protocol that was never designed for it.

Kafka is dead, long live Kafka

Jul 25, 2023
Richard Artoul
The problems with Kafka, and how we created WarpStream to solve them.