Sr. Production Engineer - Public Sector
The Senior Production Engineer owns secure cloud infrastructure, IAM, networking, and automation across AWS, Azure, and GCP for public sector and regulated environments. Requires 5+ years of experience, cloud expertise, security clearance eligibility, and strong operational skills.
Responsibilities
Security-Focused Cloud Operations
- Design, automate, and operate the IAM, account/subscription, and project lifecycle across AWS, Azure, and GCP, enforcing least-privilege and standardized access patterns at scale.
- Review, implement, and continuously improve cloud identity and access policies (IAM, Okta, Opal) to align with Databricks security standards and audit requirements.
Production Engineering & Automation
- Build and maintain reliable, observable automation and tooling to apply cloud changes (roles, policies, accounts, networking) safely and repeatedly.
- Treat operational and security issues as software problems: eliminate toil, drive root-cause analysis, and codify fixes into infrastructure and tooling.
Security Data Pipelines & Compliance
- Own and improve security and audit logging data pipelines from cloud providers into our internal systems, ensuring timely, accurate data for detection, investigations, and audits.
- Partner with Security, Compliance, and Audit teams to provide evidence, clarifications, and policy updates that keep our environments aligned with evolving standards.
Regulated & Specialized Environments
- Operate and improve specialized, highly regulated environments (e.g., FedRAMP / GovCloud), including release management, patching cadences, and support for secure access workflows (e.g., SAW).
- Ensure high availability and resiliency for critical security and access infrastructure across these environments.
On-Call & Incident Response
- Participate in a 24x7 on-call rotation for high-severity incidents impacting cloud accounts, IAM, or security data pipelines.
- Act as a key partner to product engineering, security engineering, and field teams during incidents to restore service and harden systems for the future.
Requirements
Education: BS, MS, or PhD in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
Experience: 5+ years of experience in production engineering, SRE, security engineering, or cloud infrastructure roles.
Cloud & Infrastructure Expertise:
- Deep hands-on experience with at least one major cloud provider (AWS, Azure, or GCP) in areas such as IAM, networking, accounts/subscriptions/projects, and audit logging.
- Strong background in Infrastructure-as-Code and automation (e.g., Terraform, CloudFormation, or similar) and CI/CD for infrastructure changes.
Security & Compliance Mindset:
- Proven experience working in or with security-sensitive or regulated environments (e.g., SOC2, FedRAMP, ISO 27001, financial services, public sector) and translating requirements into concrete technical controls.
- Familiarity with access review processes, policy baselines, and audit evidence for cloud environments.
Operational Excellence:
- Demonstrated success running high-availability, security-critical services, including on-call responsibilities and incident management.
- Strong debugging and problem-solving skills across distributed systems, with the ability to navigate ambiguous issues spanning multiple teams and platforms.
Required: Candidates must be eligible for a Top Secret / Sensitive Compartmented Information (TS/SCI) security clearance.
Nice-to-Haves / Bonus
- Possession of a current polygraph (Counterintelligence or Full Scope).
- Experience with Okta, Opal, or similar identity/access tooling.
- Background operating secure admin workstations (SAW) or comparable hardened access patterns.
- Experience migrating cloud accounts or subscriptions during M&A or large-scale reorganizations.