At StarTree, we've taken a novel approach to querying Iceberg tables: applying Apache Pinot-style indexes to improve performance, lower costs, and increase concurrency… without moving data into a separate system.
In our tests on a ~1TB dataset, this cut query latency and raised throughput:
* 500+ QPS with sub-second latency
* Complex aggregations <650ms
* ~5–37x faster than ClickHouse and ~4–17x faster than Trino in the same setup
We welcome comments and input from this community!
This is really cool stuff. Amazing the efficiencies Pinot brings to data systems.
[Disclosure: I used to work at StarTree. Still a fan!]