# Software Engineer, Data Foundations
**Company:** [Glean](https://hotfix.jobs/companies/glean)
**Location:** Unspecified
**Salary:** $140K-$265K
**Experience:** 3+ years
**Skills:** Java, Go, C++, Python, SQL, NoSQL, Distributed Systems, Data Pipelines, Kubernetes, Apache Kafka
**Posted:** 2025-12-08
> Build and scale data ingestion pipelines and connectors for enterprise SaaS apps, transform unstructured data for AI search and agents, and ensure reliability and security at petabyte scale. Requires 3+ years of backend/data infrastructure experience with distributed systems.
## Job Description
## You will work on:

**Ingestion & Connectivity**
- Build and scale connectors to SaaS and on-prem systems (Google Workspace, Microsoft 365, Slack, Salesforce, Jira, ServiceNow, GitHub, etc.).
- Handle full syncs, low-latency incremental updates via webhooks/APIs, rate limiting, and complex authentication flows.
- Build advanced data source capabilities such as actions, live fetch, and query-language support.

**Data Processing & Modeling**
- Transform raw, unstructured enterprise content into rich, structured, permission-aware representations optimized for search and LLM reasoning.
- Design document schemas and enrichment pipelines (entity extraction, access-graph propagation, redactions, etc.).
- Expand AI products through deep integrations for task automation, complex queries, and live data enhancement.

**Reliability & Distributed Systems**
- Own end-to-end correctness, freshness, and performance for petabyte-scale data flows.
- Solve problems in ordering, idempotency, exactly-once processing, backpressure, and retries across distributed queues, workers, and storage.

**Security & Permissions**
- Preserve fine-grained ACLs, deletions, and sensitivity constraints so that AI answers respect each user's permissions.

**Cross-Functional Impact**
- Partner with Search Serving, Product, Platforms, and Security teams to define how enterprise context is exposed to LLMs and agents.
- Improve observability, alerting, and automation for larger customers and data sources.

## About you:
- 3+ years building production backend or data infrastructure systems (**Java**, **Go**, **C++**, **Python**, etc.).
- Hands-on experience with distributed systems, data pipelines, queues, and large-scale storage (**SQL**/**NoSQL**).
- Think in SLOs, error budgets, failure modes, and correctness guarantees.
- Comfortable with strict consistency and permission-modeling challenges.
- Prior work on enterprise connectors, search/indexing, information retrieval, or security-sensitive systems is a strong plus.
- Passionate about trustworthy AI via rock-solid data foundations.
- Power user of LLMs and AI tools.

## Compensation & Benefits
Base salary range: **$140,000 - $265,000** annually (varies by location, level, knowledge, skills, and experience). Eligible for variable compensation, equity, and benefits including medical, vision, dental, time off, 401(k), stipends, events, and daily lunches.
**Apply:** https://hotfix.jobs/jobs/software-engineer-data-foundations-at-glean-848e9c69-d824-44d5-8469-f10c9184ba01
**Canonical:** https://hotfix.jobs/jobs/software-engineer-data-foundations-at-glean-848e9c69-d824-44d5-8469-f10c9184ba01