We’re looking for a Data Engineer to help build and maintain the data pipelines that power our AI- and NLP-driven narrative analysis platform. You’ll work with everything from structured financial data to unstructured news and social media content.
The Role
You’ll be responsible for building reliable data pipelines that feed our AI/LLM systems and maintain our data warehouse. This includes handling data ingestion, processing LLM outputs, and creating batch processes that keep our analytics platform running smoothly.
What You’ll Do
- Build and maintain data pipelines for processing both structured and unstructured data;
- Create and optimize ETL/ELT processes in Snowflake and dbt;
- Develop scripts for web scraping and API integrations to capture new data sources;
- Implement batch processing jobs for overnight data products with strict SLAs;
- Work with the team to model and structure new datasets;
- Maintain data quality and pipeline monitoring; and
- Process and structure outputs from large-scale batch LLM operations.
Technical Requirements
- 1-3 years of data engineering experience;
- Strong Python programming skills and proficiency in SQL and data warehouse concepts;
- Experience with AWS cloud services;
- Understanding of ETL/ELT principles and batch processing;
- Familiarity with API integration and web scraping; and
- Knowledge of data modeling concepts.
Nice to Have
- Experience working with LLMs or NLP;
- Familiarity with unstructured data processing;
- Knowledge of Snowflake-specific administration, UDF design, and SQL dialect;
- Experience with data orchestration tools (e.g., Airflow, Prefect, AWS Step Functions);
- Experience with infrastructure automation (Terraform, Pulumi, etc.); and
- Knowledge of monitoring and observability practices.
What Makes a Good Fit
- You care about building reliable, maintainable systems;
- You’re curious about working with new types of data;
- You’re comfortable learning new tools and approaches; and
- You can balance getting things done with doing things right.
What We Offer
- All-in compensation of $100-140k;
- Equity upside;
- Chance to work with cutting-edge AI/LLM technology;
- Mix of traditional data engineering and innovative new approaches;
- Direct impact on a growing analytics platform;
- Small team where your work matters; and
- Location: Fairfield County, CT, with partial remote work possible.
How to Apply
This is an early career position with room to grow.
Send to [email protected] (1) a resume and (2) a brief cover/introduction email, if you think there are important things your resume doesn’t adequately cover. Please include the position you are applying for in the subject line of your email.