Data Engineer

New York, NY | San Francisco, CA | Data Engineering | Remote | 5+ YOE
Summary

Builds and maintains production-grade data processing systems and storage infrastructure for advanced AI platforms. Requires 5+ years of experience with production-grade data processing systems, a strong command of Python and SQL, and experience with distributed frameworks such as Spark, Beam, or Flink and storage backends such as S3 and GCS.

About the role

Responsibilities

  • Work directly on storage infrastructure, product launches, and new customer experiences built on one of the most advanced AI systems in the world
  • Collaborate daily with researchers and engineers
  • Run implementations end-to-end and see initiatives through to real outcomes
  • Partner across research, marketing, sales, and finance to help define how Cohere grows, with your recommendations feeding directly into products and strategy

Requirements

  • 5+ years of experience working on production-grade data processing systems
  • Strong command of Python and SQL
  • Experience with distributed data processing frameworks such as Apache Beam, Spark, or Flink
  • The ability to transform unstructured data into performant datasets across diverse storage backends including S3, GCS, and POSIX

Nice-to-haves

  • Experience with modern orchestration platforms, especially Kubernetes
  • Familiarity with modern analytics stack tooling such as BigQuery, Airflow, or dbt
  • Knowledge of Java or Golang
  • Genuine excitement about AI

Skills

Python, SQL, Apache Beam, Spark, Flink, S3, GCS, Kubernetes, BigQuery, Airflow