Job description
This is a fully remote position on a 6-month contract (PJ model). The pay range is USD 25-30 per hour.
Must-haves:
- 5+ years of professional experience as a Data Engineer
- Experience with Python and Bash – shell script wrappers and Python for building data pipelines
- Experience working with big data and keeping it optimized
- Airflow experience for orchestration
- Snowflake/Redshift database experience
- Terraform and AWS experience – S3, SQS
- SQL – ad hoc queries and SQL for data pipelines
Plusses:
- AI tools for generating code – Claude, Codex, Copilot
- Streaming and batch experience – Spark, Kafka
The Search and AI Platform is our client's agentic data platform, which powers their products and next-generation LLM-powered research systems.
The platform uses agentic services to interrogate our rich knowledge graphs, search and recommendation systems, and our unparalleled collection of research data to deliver insights to the scientific community so they can collaborate more effectively, work smarter, and deliver quality research more quickly.
We’re looking for an innovative, passionate Senior Data Engineer I to be the senior data engineer on the new AI Content team. You will help design, build and maintain scalable pipelines that ingest content into a centralized content storage system, help build and maintain a scalable content enrichment and retrieval system, and build pipelines that load content search indexes. The newly built systems are in early-stage implementation, and this team has the remit to ingest massive amounts of content to support AI-powered products across the organization.
You will be:
- Designing, prototyping, and building robust and scalable pipelines using AI-assisted, spec-driven development, following clean code and best-practice software engineering principles
- Working with technologies including Python (FastAPI), Spark, Airflow, Snowflake, Iceberg, Kafka and RDBMS
- Building cloud infrastructure in AWS to host and monitor the services, automating common tasks mercilessly