Staff Software Engineer - Distributed Data Systems
Develops distributed data systems like Apache Spark and Delta Lake at massive scale, ensuring high performance and reliability for exabyte-scale workloads. Requires 8+ years in Java/Scala/C++ and deep distributed systems expertise.
Key Projects
- Apache Spark™: Develop the open source standard for big data processing.
- Data Plane Storage: Build services on top of cloud storage backends such as AWS S3 and Azure Blob Storage.
- Delta Lake: Create a storage layer with ACID transactions and time travel.
- Delta Pipelines: Orchestrate thousands of data pipelines.
- Performance Engineering: Optimize query engines for speed and scalability.
Requirements
- BS in Computer Science or equivalent.
- 8+ years of production experience in Java, Scala, or C++.
- Strong knowledge of algorithms, data structures, and distributed systems.
- Experience with databases and big data systems (Apache Spark™, Hadoop).
Nice-to-Haves
- MS or PhD in databases or distributed systems.
- Comfortable working toward a multi-year vision with a focus on customer impact.
Staff Data Platform Engineer
Build and lead AWS-native data platform architecture, orchestration, governance, and AI-readiness for analytics and ML workloads. Requires 8-10+ years of experience with AWS data systems and strong technical leadership.
Manager, Data Engineering
Lead and mentor a team of data engineers building scalable data pipelines and platform infrastructure. The role combines hands-on coding, operational excellence, and cross-functional collaboration with analytics, data science, and business teams.
Senior Data Engineer, People Analytics
Build and maintain data pipelines, tables, and AI-ready data foundations from HR systems to power People Analytics reporting, dashboards, and LLM tools. Requires 5+ years of data engineering experience with strong SQL, Python, Airflow, and data governance skills.