Research Engineer - CUDA Kernel Engineering

Develops and optimizes CUDA kernels for AI models that accelerate semiconductor design, verification, and chip optimization across large-scale GPU clusters. Integrates with frameworks such as PyTorch and with the latest NVIDIA hardware for training, inference, and RL workloads.

Palo Alto, CA | Backend Engineering | Onsite

About the role

Responsibilities

Develop, integrate, and optimize state-of-the-art CUDA kernels to power AI models accelerating semiconductor design and verification.
Enable large-scale model training, inference, and reinforcement learning systems for circuit layouts, RTL generation/validation, and chip architecture optimization across thousands of GPUs.
Build tools, performance benchmarks, and integration layers to push GPU utilization limits for compute-intensive workloads in AI-driven hardware design.
Collaborate with researchers and engineers; contribute kernels and tooling to open-source AI and HPC ecosystems.

Requirements

Writing and optimizing CUDA kernels for large-scale AI workloads (attention, routing, graph-based operations, physics-inspired operators).
Profiling and optimizing GPU performance for custom compute-bound or memory-bound workloads.
Integrating custom kernels into training and inference frameworks (PyTorch, Megatron, vLLM, TorchTitan).
Working with the latest NVIDIA hardware and software (Hopper, Blackwell, NVLink, NCCL, Triton).
Building GPU-accelerated primitives for graph reasoning, symbolic computation, or hardware simulation.
Collaborating with AI researchers and semiconductor experts to translate domain-specific workloads into high-performance GPU code.

Skills

CUDA, PyTorch, NVIDIA Hopper, NVIDIA Blackwell, NVLink, NCCL, Triton Inference Server, Megatron, vLLM, TorchTitan
