Senior Platform Engineer
Leads development of DataHub's ingestion framework, building scalable metadata systems, APIs, and event-driven architectures for enterprise AI and data platforms. Requires 4+ years of experience building distributed systems, plus advanced Python expertise.
Responsibilities
- Build scalable, fault-tolerant ingestion systems for enterprise-scale metadata
- Develop clean, intuitive APIs for our connector ecosystem
- Create event-driven architectures for real-time metadata processing
- Implement schema mapping between diverse systems and DataHub's unified model
- Develop versioning systems for AI assets (training data, model weights, embeddings)
Requirements
- 4+ years building production-grade distributed systems
- Advanced Python expertise with a focus on API design
- Experience with high-scale data processing or integration frameworks
- Strong systems knowledge and distributed architecture experience
- A track record of solving complex technical challenges
Nice-to-Haves
- Experience with DataHub or similar metadata/ETL frameworks (Airflow, Airbyte, dbt)
- Open-source contributions
- Early-stage startup experience
Compensation
Salary Range: $225,000 to $300,000