Data Engineering Prompts
1. Learn Any Concept Deeply
Explain Apache Spark like I’m a mid-level data engineer working on production systems. Cover architecture, execution flow, lazy evaluation, DAGs, shuffles, partitioning, caching, optimization techniques, and real-world bottlenecks with examples.
2. Real Production Problems
List 10 real production issues that data engineers face in Databricks, and explain how experienced engineers debug and solve each one.
3. Architecture Thinking
Design a scalable ETL pipeline for processing 10 TB/day of data from Kafka into S3, Snowflake, and Elasticsearch. Include monitoring, retries, schema evolution, partitioning, and cost optimization.