Discussion about this post

jin

Great summary, thanks

Kolby Madison

DoorDash’s data stack is seriously impressive. I like how they mix open-source tools like Kafka, Flink, Spark, and Pinot with AWS to handle massive amounts of data.

Building a Lakehouse on S3 and Delta, plus using Trino, Airflow, and Sigma, shows how carefully they planned for scale and flexibility. Having 12,000 Sigma users is impressive!

Managing real-time and batch workloads side by side like that isn’t easy.

If you were starting from scratch, which tool would you prioritize first?
