Senior Site Reliability Engineer - Government Cloud
Build and operate AWS GovCloud infrastructure for federal customers, owning IaC, container pipelines, compliance documentation, and operational tooling. Requires 5+ years AWS experience and FedRAMP familiarity.
What you will be doing
- Building and operating the AWS GovCloud environment that will host Tines for federal customers — from foundational network architecture through to production-ready, assessment-ready infrastructure.
- Designing and implementing repeatable infrastructure-as-code to provision dedicated customer environments.
- Owning the container image pipeline for our government deployment — building, hardening, scanning, and promoting FIPS-compliant images through our CI/CD pipeline using AWS native tooling.
- Identifying and fixing availability risks and monitoring gaps to ensure our government environments stay healthy, observable, and auditable.
- Working closely with our assessment partners to produce the infrastructure documentation, architecture diagrams, and evidence needed for FedRAMP authorization.
- Enabling product engineers to build new features that work seamlessly across our commercial and government environments by improving observability and logging and simplifying deployments.
- Defining how we separate compliance-restricted functions from day-to-day engineering operations so the team can ship code and respond to incidents without breaking the security boundary.
- Supporting our self-hosted federal customers operating in our CMMC environment, including handling escalations and complex, long-running support cases as part of the team's on-call responsibilities.
Projects you might work on
- Designing the infrastructure-as-code library for GovCloud customer provisioning — a repeatable process to stand up an isolated environment with all required AWS services pre-configured with FedRAMP-required encryption and logging.
- Building the CI/CD pipeline that promotes container images from development through staging to GovCloud production, with vulnerability scanning gates and change control documentation baked into the workflow.
- Creating operational runbooks for customer provisioning, incident response, patching, and disaster recovery that satisfy our assessment requirements.
- Setting up monitoring dashboards and alarms that feed into a Tines tenant for automated incident triage.
- Building IAM structures and permission boundaries that let engineers deploy and debug in production while maintaining the least-privilege access required for compliance.
- Monitoring, scaling, and operating data services like OpenSearch in production — managing indexes and retention, tuning for performance, and building in-product tooling that surfaces cluster health and observability to the team.
- Collaborating with our Product and Design teams to enable compliance-specific product features like smart card authentication and DNS security extensions.
- Writing documentation that helps the broader engineering team understand how to build and test features in a compliance-regulated environment.
Is this the right role for you?
- 5+ years in an infrastructure, DevOps, or cloud engineering role with meaningful time spent in AWS. Direct experience with AWS GovCloud is a strong plus, but deep AWS fluency with a willingness to navigate GovCloud's constraints is what matters most.
- Hands-on experience designing VPC architectures, configuring encryption at rest and in transit, and operating AWS native compute, database, and caching services in production under real workloads.
- Have worked with infrastructure-as-code tools like CDK or Terraform in FedRAMP or CMMC environments, preferably supporting a customer-facing SaaS product.
- Understand what it takes to operate in a compliance-regulated environment. FedRAMP, FISMA, or similar experience is valuable.
- Comfortable with container image pipelines and hardening. Able to reason about base image provenance, vulnerability scanning, and what "hardened" actually means in practice.
- Good instincts for the boundary between "locked down for compliance" and "usable by engineering."
- Can write clearly. This role involves producing tech plans, runbooks, and operational documentation that will be reviewed during our FedRAMP assessment.
- Comfortable learning new technologies. We use Ruby, Rails, React, TypeScript, Postgres, Redis, and Kubernetes.