Senior Manager, DevOps Engineering
Lead and mentor a team of DevOps and Infrastructure Engineers responsible for build pipelines, CI/CD systems, developer tooling, and release infrastructure across Hivemind Solutions. Drive modernization of C++/Python build ecosystems and ensure scalable, secure software delivery pipelines.
Responsibilities
- Manage and grow a team of specialized DevOps, build, and infrastructure engineers. Set clear technical goals, conduct performance reviews, provide mentorship, and foster a culture of operational excellence.
- Maintain, optimize, and standardize C++, Python, and Node.js build ecosystems using tools like Conan, CMake, and Poetry; lead the technical roadmap and migration strategy for reproducible environment tooling such as Nix.
- Partner closely with the Hivemind Platform (HMP) team to ensure seamless technical alignment, common build patterns, and efficient packaging/dependency update strategies.
- Own the performance, security, and scaling of repository hosting and CI/CD pipelines in GitLab. Partner on integrating GitLab with Jira and Confluence.
- Govern Artifactory remote registries (including internal C++, Python, Docker, and restricted ITAR/CUI repositories) and orchestrate secure container build workflows using BuildKit/Docker with strict credential containment.
- Champion the "Developer Experience" (DevEx) by optimizing local development environments, streamlining workstation setup scripts, building shared developer CLI utilities, and promoting best practices for documentation and workspace templates.
- Support the deployment, scaling, and orchestration of the Forge simulation and test orchestrator within Gov Cloud environments. Partner with simulation and test teams to optimize test runner performance and automate simulation-artifact collection pipelines.
- Collaborate cross-functionally with the Hivemind Platform (HMP), Autonomy, Systems Integration, Security, and IT groups to align DevOps roadmaps with broader Hivemind Solutions milestones.
Requirements
- Typically requires a Bachelor's degree in Computer Science, Computer Engineering, or a related technical discipline and 7+ years of related experience (or 5+ years with a Master's, or equivalent practical experience), including 2+ years of direct people leadership, team lead, or engineering management experience.
- Strong practical experience managing enterprise C/C++ build systems (CMake) and package management (Conan) alongside Python package management (Poetry, pip).
- Proven track record building, maintaining, and scaling enterprise CI/CD systems, with deep expertise in GitLab CI/CD pipelines and runner infrastructure.
- Direct experience with Jira, Confluence, and GitLab integrations to automate releases, track work, manage sprint cycles, and maintain engineering documentation.
- Hands-on experience managing Artifactory (or Nexus), Docker registries, and multi-stage Docker builds optimized for size, speed, and security (BuildKit secrets).
- Familiarity with deploying and managing test runners or applications in cloud environments (such as Azure Gov Cloud or AWS).
- Familiarity with declarative configuration languages (CUE, JSON, YAML) and container orchestration tools (Docker Compose, Kubernetes, or similar).
- Solid Linux (Ubuntu/Red Hat) system administration skills, including debugging tools, performance tuning, and scripting (Bash, Python) to troubleshoot developer environments and pipelines.
- Experience with secure software development practices, secret management, identity providers (Active Directory, Entra ID), and managing software compilation/deployment under compliance frameworks (such as NIST or ITAR/CUI-governed environments).
- Ability to obtain and maintain an active U.S. SECRET security clearance (U.S. citizenship required).
Nice-to-Haves
- Familiarity with Nix or other reproducible development environments, package managers, and declarative build systems is highly preferred.
Senior Network & Site Reliability Engineer
Design, operate, and automate the global network and reliability layer for a high-performance NVIDIA DGX SuperPOD supporting ML workloads. Own architecture, observability, incident response, and security for mission-critical infrastructure.
Senior Data Engineer, Sentinel (Pacific Time Zone)
Senior Infrastructure Engineer building and operating AWS cloud infrastructure for a healthcare data platform. Requires Python, Terraform, CI/CD expertise, and big data tools experience.
Senior Software Engineer - Observability Visibility
Senior engineer building observability and resilience standards, tooling, and automation to make reliability the default across Datadog services. Requires 5+ years of experience, Go/Python skills, and AI feature delivery experience.
Software Engineer, Developer Experience
Build internal AI tools and autonomous agents that embed into Retool's engineering workflows to boost developer productivity and reduce toil. Requires shipping real AI-powered developer tools and infrastructure.
Senior Asset Pipeline Engineer
Design and own the OpenUSD-based asset pipeline for a high-fidelity sensor simulation platform. Build automated DCC-to-engine pipelines, custom schemas, material conversion, and validation systems at library scale.