Sr. Production Engineer, Solutions Engineering

140k – 288k · Chicago, IL · California · Remote · 5+ YOE · Jun 12

Summary

Senior Production Engineer building AI agents, platforms, and automation to ensure the reliability of Pinterest's large-scale distributed systems, which serve hundreds of millions of users.

About the role

What you’ll do

Design and build AI agents that augment production reliability work — Develop agents that assist engineers with service health analysis, reliability recommendations, migration playbook generation, and risk identification
Drive large-scale infrastructure modernization with AI-accelerated execution — Lead Kubernetes adoption and platform transitions using AI to generate automation
Transform consulting patterns into scalable platforms — Execute scoped reliability engagements with engineering teams, then encode successful approaches into AI-assisted tools and automation
Build the knowledge infrastructure that powers Pinterest's operational agent ecosystem — Create migration playbooks, operational runbooks, incident patterns, and best practices
Develop software solutions to enable reliability and operability of large-scale distributed systems
Build tools and automation to eliminate toil and reduce operational overhead
Build meaningful, insightful and actionable SLIs — Develop service level indicators that provide clear signals of system health
Automate critical portions of Pinterest's engineering processes
Manage capacity and performance to help scale our infrastructure — Partner with teams to plan and optimize capacity across public and private clouds

What we’re looking for

5+ years of industry experience building and operating large-scale, high-performance distributed systems
Bachelor's degree in Computer Science or related field, or equivalent experience
Strong programming skills in Python or Go — ability to build production-grade platforms, agents, and automation
Deep knowledge of Linux/Unix internals and experience with open source infrastructure (MySQL, Kafka, Envoy, Hadoop, etc.)
Infrastructure as Code experience (Terraform, Puppet, Chef, Ansible, Docker, Kubernetes)
Experience deploying web applications to cloud infrastructure (AWS, GCP, or Azure) and working with distributed, service-oriented architecture

Preferred

Experience developing AI agents for infrastructure automation, operational decision-making, or reliability workflows
AI/ML infrastructure experience (LLM-based systems, model serving, agentic workflows)
Technical consulting or embedded SRE experience with cross-functional engineering teams

Skills

Python, Go, Linux, MySQL, Kafka, Envoy, Hadoop, Terraform, Puppet, Chef, Ansible, Docker, Kubernetes, AWS, GCP
