Member of Technical Staff
Design and maintain large-scale backend infrastructure for distributed ML training, inference, and data pipelines at a generative AI startup. Requires 4+ years building scalable cloud systems with Python/Go/C++ and distributed data technologies.
Responsibilities
- Design, develop, and maintain large-scale backend and cloud-native infrastructure to support distributed machine learning training, inference, and data processing pipelines for a generative AI platform.
- Architect and build scalable, resilient backend infrastructure to support distributed training, inference, and data processing pipelines.
- Lead technical design discussions, mentor engineers, and establish best practices for large-scale machine learning systems.
- Design and implement core backend services with a focus on efficiency and low latency.
- Drive infrastructure optimization initiatives for compute cost, storage lifecycle management, and network performance.
- Collaborate with machine learning, DevOps, and product teams to translate research and product requirements into robust infrastructure solutions.
- Evaluate and integrate cloud-native and open-source technologies such as Kubernetes, Ray, Kubeflow, and MLflow to enhance platform reliability.
- Own end-to-end systems from design to deployment, emphasizing reliability, fault tolerance, and operational excellence.
Requirements
- Bachelor’s degree or equivalent in Computer Science or a related field, plus four (4) years of experience in software engineering or a related role.
- 4 years of experience designing, building, and optimizing large-scale backend infrastructure and distributed data systems (e.g., PostgreSQL, MySQL, DynamoDB, Apache Spark, Apache Flink, Apache Kafka) in cloud environments (AWS, GCP, Azure, or equivalent), including cloud-native platforms, core infrastructure components, and optimization techniques (caching, indexing, sharding, replication, transactions, ACID).
- 4 years of experience with major server-side programming languages and frameworks (e.g., Python, C++, Go, TypeScript).
- 4 years of experience writing technical design documentation, leading cross-functional projects, and collaborating with cross-functional teams to achieve business impact.
- 3 years of experience developing and maintaining data processing and API systems, including client-server communication frameworks (e.g., gRPC, Thrift).
- 3 years of experience conducting A/B testing and scientific experimentation (e.g., Statsig, Meta Deltoid, Optimizely) to measure software impact.
- 3 years of experience conducting coding interviews and providing systematic feedback for engineering candidates.
- 2 years of experience with cloud-native tools and infrastructure, such as Docker and Kubernetes.
- 2 years of experience defining and implementing data-driven metrics to support company or team goals.
Compensation & Benefits
- Base Pay Range: $175,000–$220,000 USD (plus equity)
- Total compensation includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package.
Senior Software Engineer, Strategic Integrations
Senior engineer leading platform quality, legacy migration, and observability for enterprise partner integrations. Requires strong backend experience, third-party API integration at scale, and incremental migration expertise.
Software Engineer III (Ruby on Rails)
Own end-to-end feature development on Rails-based backend services powering feeds and profiles. Deliver complex work, guide junior engineers, and contribute to architectural decisions in a fully remote environment.
Staff Software Engineer
Staff engineer on the Containers team owning complex technical components of Chainguard Images, driving long-term technical direction, and mentoring engineers. Requires 10+ years of experience, deep expertise in containers/Kubernetes/Go, and IaC.
Member of Technical Staff, Core Backend
Owns the StreamModule voice pipeline (VAD→STT→LLM→TTS) for real-time call agents. Consolidates BullMQ to Kafka, hardens provider abstractions, adds OTEL tracing, and eliminates Postgres SPOFs.
Staff Software Engineer, Backend
Staff Backend Engineer building scalable Go services and APIs for Okta's Privileged Access Management platform. Focus on distributed systems, database design, and production reliability for enterprise security infrastructure.