Software Engineer, Sensor Integration

Build and maintain ingestion pipelines that convert large-scale geospatial sensor data (LiDAR, imagery) into standardized formats for ML training and product use. Requires strong Python skills, comfort with undocumented formats, and distributed systems experience.

San Francisco, CA | Data Engineering | Hybrid

About the role

Responsibilities

Own the ingestion pipelines that convert point clouds and imagery from hardware vendors into Mach9's standard internal format
Reverse-engineer new vendor formats and format updates, often with sparse or no documentation, to expand what data Mach9 can take in
Build agentic systems to automatically triage failures and reformat data
Build automated checks and regression testing to guarantee the consistency of our data
Optimize the performance of our processing and storage across massive geospatial datasets in the cloud
Work directly with customers and partners to unblock critical customer projects

Requirements

Strong software development and debugging skills
Experience building production software in Python
Comfort operating with ambiguity — ability to dig into undocumented or messy data formats and reverse-engineer them
Strong communication skills, with the ability to work across ML, product, and customer success teams
A foundation in parallel computing or distributed systems
Bachelor's degree in Computer Science, Engineering, or equivalent experience

Nice-to-Haves

Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching
Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf)
Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3)
Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch)
Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms)
Experience building data versioning, lineage, or artifact-tracking systems
Experience operating data pipelines that feed ML training and inference
Familiarity with C++

Skills

Python, Parallel Computing, Distributed Systems, GDAL, PDAL, PostGIS, AWS S3, Spark, AWS Batch, C++
