Software Engineer, Production Engineering

The Software Engineer in Production Engineering ensures the reliability, scalability, and performance of Figma's services, drives infrastructure initiatives, debugs production issues, and builds operational tools. The role requires 5+ years of experience with large-scale systems and cloud infrastructure.

149k – 350k | San Francisco, CA | New York, NY | DevOps / SRE | Remote | 5+ YOE

About the role

What you'll do at Figma:

Work closely with the engineering team to define standard methodologies and goals around reliability, durability, scalability, and performance
Address common operational challenges through better telemetry and by building tools / services
Debug production issues across services and levels of the stack
Participate in design reviews and production reviews for new features, products or infrastructure components
Plan for the growth of Figma's infrastructure
Operate and maintain AWS infrastructure

We'd love to hear from you if you have:

5+ years of Software Engineering experience operating infrastructure components / services at scale
Proven grasp of Computer Science fundamentals and strong interest in distributed systems
Experience managing infrastructure services in AWS, Microsoft Azure or Google Cloud
Track record of diagnosing problems within complex systems

While it's not required, it's an added plus if you also have:

Excellent problem-solving and technical communication skills
Deep care for software/system quality, with an artisan mentality
Demonstrated, unwavering commitment to operational security and best practices

Skills

AWS, Distributed Systems, Computer Science Fundamentals, Telemetry, Infrastructure As Code, GCP, Microsoft Azure, Production Debugging, Scalability, Reliability Engineering

Similar roles

DevOps / SRE jobs

xAI

Software Engineer - Networking Software and Services

Build software, services, and frameworks for network management, automation, and monitoring of large-scale GPU supercomputing fabrics. Requires deep network protocol knowledge and experience orchestrating tens of thousands of devices.

150k – 250k | Palo Alto, CA +1 | DevOps / SRE | Hybrid | 5+ YOE | Go | BGP

Rillet

Software Engineer, Platform

Own infrastructure, CI/CD, and developer tooling for a fast-scaling AI-native ERP. Set technical direction for reliability, security, and API design in a hybrid NYC/SF environment.

150k – 270k | New York, NY +1 | DevOps / SRE | Hybrid | 5+ YOE | AWS | CI/CD

Capacity

Software Engineer, Enablement

Design, build, and operate AI-powered engineering tools and developer productivity platforms. Focus on AI pairing pipelines, automated workflows, and internal tooling to accelerate engineering velocity.

150k – 180k | United States | DevOps / SRE | Remote | 3+ YOE | Go | LLMs

Flint

Infrastructure Engineer

Flint is seeking an Infrastructure Engineer to own the systems powering their AI-generated pages at scale. This 0-to-1 role involves building production-grade cloud architecture, CI/CD, deployments, observability, and security, with a focus on managing parallel background agents.

150k – 250k | San Francisco, CA | DevOps / SRE | On-site | AWS | GCP

Reducto

Infrastructure Engineer

Founding Infrastructure Engineer to architect and scale resilient systems for AI/ML workloads, implement monitoring/observability, and automate infrastructure. Requires 5+ years production experience, Python, Kubernetes, and strong reliability focus.

150k – 300k | San Francisco, CA | DevOps / SRE | On-site | 5+ YOE | Python | Kubernetes