
Research Engineer — Search/IR

Builds and operates full-stack search and information retrieval systems at massive scale, owning ingestion pipelines, indexing, ranking, freshness, and serving for Firecrawl's web content platform. Requires 3+ years of experience with large-scale search infrastructure and related technologies such as Elasticsearch and BM25 ranking.

$180k – $290k · San Francisco, CA · Backend Engineering · Remote · 3+ YOE

About the role

What You'll Do

  • Build and operate search indexes at massive scale. Design, build, and maintain the indexing infrastructure that powers Firecrawl's core product. You'll handle billions of documents and care about every millisecond of latency and every byte of storage.
  • Own the full stack from ingestion to serving. You don't just build one piece — you own the entire pipeline. Ingestion, processing, indexing, ranking, query understanding, and serving. When something breaks at 3am, you know where to look because you built it.
  • Solve ranking, relevance, and query understanding. Make sure the right content surfaces for the right queries. You'll build and iterate on ranking models, relevance scoring, and query parsing systems that directly impact product quality.
  • Tackle freshness, dedup, and incremental indexing. The web changes constantly. You'll build systems that keep our index fresh without re-crawling everything, deduplicate content intelligently, and handle incremental updates at scale without rebuilding from scratch.
  • Run experiments and ship results to production. You design experiments, measure results rigorously, and ship winners to production fast. You don't need someone to tell you what to try next — you have a backlog of ideas and the judgment to prioritize them.
  • Collaborate closely with the team. Work directly with the RL-focused Research Engineer and the engineering team to connect search/IR improvements with model training and the broader product roadmap.

What We're Looking For

  • Has built search indexes at massive scale. Not a tutorial project — real indexes serving real traffic with real latency requirements. You've dealt with the hard problems: sharding strategies, index compaction, schema evolution, and the operational complexity of keeping billions of documents queryable and fast.
  • Hands-on with ranking, relevance, and query understanding. You've built or meaningfully improved ranking systems. You understand BM25, learned ranking, embedding-based retrieval, and when to use which. You can reason about relevance tradeoffs and you've shipped ranking changes that moved metrics in production.
  • Owns the full stack: ingestion → index → serving. You're not a specialist who only touches one layer. You've built and operated the entire search pipeline — from how documents enter the system to how results get served. You understand the dependencies between layers and make good architectural decisions because you see the whole picture.
  • Has solved freshness, dedup, and incremental indexing problems. You know that building the initial index is the easy part. Keeping it accurate, fresh, and deduplicated at scale is where the real engineering lives. You've built systems that handle continuous updates without full rebuilds and you've debugged the subtle correctness issues that come with incremental processing.
  • Self-directed experimenter who ships without handholding. You generate your own hypotheses, design your own experiments, and ship your own code. You don't wait for a roadmap or a sprint planning meeting. You see what needs to improve, you try something, you measure it, and you ship it if it works.

Backgrounds that tend to do well: Search engineers at companies with large-scale indexes — web search, e-commerce, document search. IR researchers who've shipped their work to production. Infrastructure engineers who've built and operated real-time indexing pipelines. Engineers from Elasticsearch, Algolia, Vespa, or similar search infrastructure teams.

Compensation & Benefits

  • Salary: $180,000–$290,000/year, based on impact, not tenure
  • Equity: Up to 0.15%
  • PTO: Generous (15 days mandatory; anything after 24 days, just ask)
  • Parental leave: 12 weeks fully paid
  • Wellness stipend: $100/month
  • Learning & Development: Up to $1,000/year
  • Sabbatical: 3 paid months after 4 years

US-based: Full medical/dental/vision (100% covered for the employee), 401(k), FSAs, pet insurance

Skills

BM25, Learned Ranking, Embedding-Based Retrieval, Elasticsearch, Algolia, Vespa, Sharding, Index Compaction, Incremental Indexing, Ranking Models, Query Understanding, Relevance Scoring
