# Staff Software Engineer, ML Performance & Systems
**Company:** [Fal](https://hotfix.jobs/companies/fal)
**Location:** San Francisco, CA
**Salary:** $180K-$250K
**Skills:** PyTorch, TensorRT, TransformerEngine, Nsight, Triton, CUTLASS, Nvidia Hardware, Model Compilation, Quantization, Model Serving, Ring Attention, FA3, FusedMLP
**Posted:** 2025-12-16
> Designs and implements novel model serving architectures on an in-house inference engine to maximize throughput and minimize latency for generative media models. Develops performance tools and collaborates with ML teams on optimizations for Nvidia-based systems.
## Job Description
## Key Responsibilities

- Help fal maintain its frontier position in performance for generative media models.
- Design and implement novel approaches to model serving architecture on top of our in-house inference engine, focusing on maximizing throughput while minimizing latency and resource usage.
- Develop performance monitoring and profiling tools to identify bottlenecks and optimization opportunities.
- Work closely with our Applied ML team and customers (frontier labs in the media space) to make sure their workloads benefit from our accelerator.

## Requirements

- Strong foundation in systems programming with expertise in identifying and fixing bottlenecks.
- Deep understanding of the cutting-edge ML infrastructure stack (PyTorch, TensorRT, TransformerEngine, Nsight), including model compilation, quantization, and serving architectures.
- Fundamental understanding of the underlying hardware (Nvidia-based systems), including custom GEMM kernels with CUTLASS.
- Proficiency in Triton, or comparable experience in lower-level accelerator programming.
- Experience with multi-dimensional model parallelism (TP with context/sequence parallel).
- Familiarity with the internals of Ring Attention, FA3, and FusedMLP implementations.

## Compensation

$180,000 - $250,000 + equity + comprehensive benefits package
**Apply:** https://hotfix.jobs/jobs/staff-software-engineer-ml-performance-systems-at-fal-c415cae2-f19b-43b9-b664-17e055276dec
**Canonical:** https://hotfix.jobs/jobs/staff-software-engineer-ml-performance-systems-at-fal-c415cae2-f19b-43b9-b664-17e055276dec