Software Engineer - Hosted Model Infrastructure

This Software Engineer will build and operate high-performance model serving infrastructure, focusing on deploying AI models in various environments. The role involves full-stack development, from inference engines and GPU scheduling to deployment pipelines and observability.

New York, NY | ML Engineering | Hybrid | 4+ YOE

About the role

Core Responsibilities

  • Building high-performance model serving infrastructure that integrates with security models, hardware constraints, and different inference engines
  • Designing intelligent request handling including authentication, rate limiting, concurrency control, and audit logging for multi-tenant model access
  • Building and maintaining packaging and deployment pipelines enabling fast, secure, and reliable model rollouts across on-premises and air-gapped environments
  • Developing observability for production AI systems to enable easy service monitoring and fast incident triage and resolution
  • Debugging complex issues and performance problems throughout the stack, including open source inference engines, container runtimes, and GPU drivers, in environments you cannot always access directly
  • Designing and running testing and benchmarking infrastructure that validates model deployments across varying GPU hardware before they reach production
  • Working with product teams and customers to understand requirements, debug production issues, and deliver the models and capabilities they need
  • Integrating hosted model infrastructure with Palantir's deployment, configuration, and identity systems

What We Value

  • Ownership mindset and bias toward quality. Our software runs in environments where direct access for debugging is limited or unavailable.
  • High empathy for customer needs and drive to deliver reliable, easy-to-use models
  • Ability to work effectively across multiple languages and layers of the stack, from backend services and ML tooling to container orchestration and deployment configuration
  • Strong debugging skills and motivation to trace problems from application code through containers, orchestration, and hardware
  • Curiosity about emerging AI capabilities and the ability to quickly evaluate and integrate new models and technologies as the landscape evolves
  • An active US security clearance, or eligibility and willingness to obtain one, is beneficial but not required

What We Require

  • 4+ years of professional software engineering experience building and operating production systems
  • Engineering background in Computer Science, Mathematics, Software Engineering, Physics, or similar field
  • Strong coding skills with demonstrated proficiency in a programming language such as Java, C++, Python, or Rust. Familiarity with the Python ML ecosystem is valuable.
  • Experience with containers, Kubernetes, and deploying backend services in production environments
  • Strong written and verbal communication skills and ability to iterate quickly with teammates, incorporating feedback and holding a high bar for quality

Skills

Java, Rust, Python, Go, Docker, Kubernetes, Gradle, GitHub, GPU, Machine Learning
