Sr Software Engineer, Infrastructure
Senior Software Engineer builds and automates scalable AWS infrastructure, manages Kubernetes clusters, and implements observability frameworks. Requires 5+ years Python experience, IaC expertise, and strong cloud/DevOps skills.
Responsibilities
- Architect and automate production-grade infrastructure on AWS using Terraform or Pulumi.
- Manage and scale containerized workloads using AKS (Azure Kubernetes Service) or EKS, focusing on cluster security and resource efficiency.
- Architect robust deployment pipelines using GitHub Actions, managing both GitHub-hosted and self-hosted runners.
- Create infrastructure for "Observable by Default" frameworks, ensuring new applications launch securely with logging and metrics enabled.
- Build internal CLI tools, AI plugins, and automation scripts to streamline developer workflows.
- Collaborate cross-functionally with Security, Engineering, Infrastructure, and Support teams.
- Mentor junior engineers, participate in code reviews, and document solutions and failure triage playbooks.
Requirements
- 5+ years of production-level experience, with strong proficiency in Python (required).
- Expert-level Terraform (modules, state management) or Pulumi (preferred).
- Hands-on experience with AWS (or Azure/GCP), Kubernetes, Docker.
- Experience building/troubleshooting integrations between infrastructure, data pipelines, and observability platforms.
- Advanced knowledge of GitHub Actions and GitHub runners.
- Strong observability mindset: logging, metrics, tracing; experience with Datadog, Prometheus, or ELK.
- Proficiency in distributed systems concepts and technologies such as Kafka or other message queues.
- Ability to operate independently on ambiguous projects.
Senior Infrastructure Engineer
Build analytics infrastructure, observability tooling, and developer platforms to support real-time AI agents for 911 centers. Requires 4+ years infrastructure/platform/backend experience and comfort across the full stack.
Senior Developer Experience Engineer
Senior Platform Engineer focused on Developer Experience, building tools, automation, CI/CD systems, and AI tooling to improve developer productivity and workflows. Requires 7+ years cloud experience, containerization, and proficiency in Ruby, Go, or Python.
Senior Site Reliability Engineer
Senior SRE to operate and evolve the EKS Kubernetes platform, CI/CD pipelines, and observability stack for Thunderbird's open-source infrastructure. Requires 7+ years infrastructure experience and strong production Kubernetes and IaC skills.