WarpStream Tableflow

Materialize Kafka Topics as Iceberg Tables

Tableflow is an Iceberg-native database that materializes tables from any Kafka topic and automates ingestion, compaction, and maintenance. A truly complete real-time data lake solution.

Source Topic
tableflow_json
Table UUID
7766aaf0-16f6-4a9b
Table Path
warpstream/tableflow/
Bucket URL
s3://bucketurl
airflow_dag.json
ERROR: orphan files detected
table_repair.hql
cleanup.yml
WARN: schema drift
manual_compaction.sh
WARN: compaction lagging behind
With Tableflow
Without Tableflow
Tableflow is the easiest, cheapest, and most flexible way to convert Kafka topic data into Iceberg tables with low latency, and keep them compacted.
Auto-Scaling
Built-In Dead-Letter Queue (DLQ)
Automatic Retention
Automatic compactions
Automatic table maintenance
Custom partitioning

Zero Ops,
Zero Access

WarpStream Tableflow follows the same Zero-Access BYOC design principles as WarpStream for Kafka, meaning that your raw data never leaves your environment.

Raw data is processed on your VMs and stored in your object storage buckets; it is never accessible to any third party (including us!).
Plug & Play

Attach to any existing Kafka cluster

WarpStream Tableflow works with any Kafka-compatible source (open-source Kafka, MSK, Confluent Cloud, WarpStream, etc.), and can run in any cloud or even on-premises. Ingest simultaneously from multiple different Kafka clusters to centralize your data in a single lake.
Compatible With
GCP BigQuery
AWS Athena
DuckDB
ClickHouse
AWS Glue
Trino
Give Us A Spec

We’ll Take Care
of the Rest

Define your Iceberg table configuration in a declarative YAML file, then sit back and relax as the WarpStream Agents connect to your Kafka cluster and start creating Iceberg tables to your exact specification.
tables:
  - source_cluster_name: "benchmark"
    source_topic: "example_json_logs_topic"
    source_format: "json"
    schema_mode: "inline"
    schema:
      fields:
        - { name: environment, type: string, id: 1 }
        - { name: service, type: string, id: 2 }
        - { name: status, type: string, id: 3 }
        - { name: message, type: string, id: 4 }
  - source_cluster_name: "benchmark"
    source_topic: "example_avro_events_topic"
    source_format: "avro"
    schema_mode: "inline"
    schema:
      fields:
        - { name: event_id, id: 1, type: string }
        - { name: user_id, id: 2, type: long }
        - { name: session_id, id: 3, type: string }
        - name: profile
          id: 4
          type: struct
          fields:
            - { name: country, id: 5, type: string }
            - { name: language, id: 6, type: string }

Iceberg-Native Database

Tableflow is not just a connector or “zero-copy” version of Kafka tiered storage. It’s a magic, auto-scaling, completely stateless, single-binary database that runs in your environments, connects to your Kafka clusters, and manufactures Iceberg tables to your specification using a declarative YAML configuration. Tableflow is to Iceberg-generating Spark pipelines what WarpStream is to Apache Kafka.
The Iceberg tables created by Tableflow are fully-managed, which means that ingestion, compaction, table maintenance, and all other operations are handled by WarpStream automatically. In addition, Tableflow allows you to configure custom sorting and partitioning schemes for your data, enabling faster queries and lower costs. 
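As a rough illustration of what a custom partitioning scheme could look like in the declarative format, here is a sketch that extends the table definition shown above. Note that the `partition_by` key and the transform syntax here are illustrative assumptions for the sake of the example, not confirmed Tableflow configuration options; Iceberg itself supports partition transforms such as `day(...)` and identity partitioning on a column.

```yaml
# Illustrative sketch only: "partition_by" and its transform syntax
# are assumed key names, not documented Tableflow options.
tables:
  - source_cluster_name: "benchmark"
    source_topic: "example_json_logs_topic"
    source_format: "json"
    schema_mode: "inline"
    schema:
      fields:
        - { name: environment, type: string, id: 1 }
        - { name: service, type: string, id: 2 }
        - { name: ts, type: timestamp, id: 3 }
    # Hypothetical: partition by day of the event timestamp, then by service,
    # so queries filtered on time ranges and service scan fewer files.
    partition_by:
      - "day(ts)"
      - "service"
```

In Iceberg terms, partitioning like this prunes data files at query planning time, which is where the "faster queries and lower costs" claim comes from.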
Comparison
Apache Spark
Built-in / “Zero Copy” / “Tiered Storage”
Connector-based solutions
WarpStream Tableflow
Auto-Scaling
Built-In DLQ
Automatically Enforces Retention
Automatically manages compactions and table maintenance
Custom partitioning
Compatible with any Kafka-compatible source
Ingest from multiple different Kafka clusters at the same time

FAQs

Don't see an answer to your question? Check our docs, or contact us directly.

Does WarpStream Tableflow require using WarpStream topics as the source?

No, Tableflow is vendor agnostic and can be used with any Kafka-compatible topic source (like open-source Kafka, MSK, Redpanda, Confluent, etc.). You do not need to use WarpStream as your source system to leverage Tableflow. In fact, you can use multiple source systems or clusters to ingest data into Tableflow.
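For example, ingesting from two different clusters is just two entries in the same `tables` list, reusing the table-definition keys from the configuration example earlier on this page. The cluster names, topic names, and fields below are placeholders:

```yaml
# Sketch: one Tableflow configuration fed by two different Kafka clusters.
# Cluster/topic names and schemas are placeholders for illustration.
tables:
  - source_cluster_name: "msk-prod"            # e.g. an AWS MSK cluster
    source_topic: "orders_json"
    source_format: "json"
    schema_mode: "inline"
    schema:
      fields:
        - { name: order_id, type: string, id: 1 }
        - { name: amount, type: long, id: 2 }
  - source_cluster_name: "confluent-analytics" # e.g. a Confluent Cloud cluster
    source_topic: "clickstream_avro"
    source_format: "avro"
    schema_mode: "inline"
    schema:
      fields:
        - { name: user_id, type: string, id: 1 }
        - { name: url, type: string, id: 2 }
```

Both topics land as Iceberg tables in the same lake, so downstream query engines see one catalog regardless of which cluster produced the data.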