Responsibilities
- Deploy and run production-grade ML inference and learning systems on Android Automotive (AAOS)
- Implement on-device multimodal LLMs, including tool-call schema design and safe dispatch to local vehicle APIs
- Integrate models using TensorFlow Lite, ONNX Runtime, or specialized vendor SDKs
- Profile and optimize models for strict latency, memory, power, and thermal budgets
- Instrument runtime performance across CPU, GPU, and NPU acceleration layers
- Design safety boundaries and guardrails for model outputs, including tool-call allowlists and fallback logic
- Interface directly with vehicle signals, sensors, and system services using C++ and JNI
Requirements
- BS, MS, or PhD in Computer Science, Electrical Engineering, or a related technical field
- 3+ years of experience shipping ML inference on embedded, mobile, or automotive platforms
- Strong proficiency in C++ and experience with native Android integration (JNI)
- Expertise in model optimization techniques such as quantization, pruning, and compilation
- Experience integrating LLM function calling or tool execution with structured outputs
- Hands-on experience with Android system services or Android Automotive OS (AAOS)
- Deep understanding of edge constraints, including real-time behavior and memory pressure
Nice to Have
- Experience with Snapdragon Automotive, ARM Ethos, or specialized NPU pipelines
- Experience running quantized LLMs on-device with llama.cpp or TFLite transformer models
- Familiarity with functional safety concepts (ISO 26262), sandboxing, or policy enforcement
- Experience bridging cloud-trained models to resource-constrained embedded runtimes
Compensation
Base salary range: $150,000 - $250,000 USD annually, plus equity and benefits.