# Post-Training Research Engineer
**Company:** [Baseten](https://hotfix.jobs/companies/baseten)
**Location:** San Francisco, CA
**Salary:** $200K-$275K
**Skills:** PyTorch, TensorFlow, JAX, Kubernetes, Slurm, Ray, Dask, InfiniBand, RoCE, GPUDirect
**Posted:** 2026-03-23
> Build in-house tooling for post-training custom ML models using advanced techniques like RL and finetuning. Requires deep expertise in transformer training, PyTorch distributed systems, parallelism strategies, GPU performance optimization, and HPC platforms.
## Job Description
## Responsibilities
- Build in-house tooling to support post-training of custom models, including reinforcement learning, supervised finetuning, and in-house research techniques.
- Train a wide spectrum of model architectures with various techniques efficiently and at scale.
- Work across the stack: systems-level concepts like Kubernetes, cgroups, storage systems, and networking topologies; PyTorch distributed tensor computation; GPU kernels.

## Requirements
- Deep understanding of modern ML techniques and tools for training transformers.
- Advanced experience in a tensor/array computation library like **PyTorch**, **TensorFlow**, **JAX**, or similar.
- Detailed understanding of transformer training parallelism strategies like data parallelism, sharded data parallelism, tensor parallelism, pipeline parallelism, and context parallelism.
- Experience profiling and improving the performance of distributed GPU programs in PyTorch or a similar framework.
- Ability to perform roofline analysis on a transformer training setup.
- Willingness to dive into messy problems, work with researchers, derive specifications, and execute.
- Familiarity with HPC and distributed computing platforms like **Slurm**, **Ray**, **Kubernetes**, and **Dask**.
- Familiarity with cluster networking technology like InfiniBand, RoCE, and GPUDirect.
- Solid fundamentals in operating systems concepts like processes, files, kernel drivers, containerisation, and networking protocols.
- Creativity and a willingness to ask difficult questions about approach, assumptions, and tooling choices.

## Benefits
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Generous PTO policy, including a company-wide Winter Break.
- Paid parental leave.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups.
**Apply:** https://hotfix.jobs/jobs/post-training-research-engineer-at-baseten-433de0d7-0639-476e-81c6-47a4d608f875
**Canonical:** https://hotfix.jobs/jobs/post-training-research-engineer-at-baseten-433de0d7-0639-476e-81c6-47a4d608f875