resumai
Python
An LLM inference pipeline that scores millions of cover letter–job posting pairs for quality and relevance. Traditional text analysis couldn’t judge whether a letter actually addressed a job; this pipeline runs each pair through an open-weight model multiple times, verifies score stability, and scales from a 1K dev set to 5M production pairs.
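The repeated-scoring check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `score_fn`, the default of three runs, and the `max_spread` threshold are all hypothetical stand-ins, and the real pipeline may measure stability differently.

```python
from statistics import mean, pstdev
from typing import Callable, Sequence


def stability_summary(scores: Sequence[float], max_spread: float = 1.0) -> dict:
    """Summarize repeated scores for one pair and flag unstable ones.

    max_spread is an assumed threshold: a pair counts as stable when its
    highest and lowest scores differ by no more than this amount.
    """
    if not scores:
        raise ValueError("need at least one score")
    spread = max(scores) - min(scores)
    return {
        "mean": mean(scores),
        "stdev": pstdev(scores),
        "spread": spread,
        "stable": spread <= max_spread,
    }


def score_repeatedly(
    score_fn: Callable[[str, str], float],  # hypothetical model call
    letter: str,
    posting: str,
    runs: int = 3,
) -> dict:
    """Score the same letter/posting pair several times and summarize."""
    scores = [score_fn(letter, posting) for _ in range(runs)]
    return stability_summary(scores)
```

Pairs flagged as unstable could then be re-scored or set aside for review before the run is scaled from the dev set to production.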
Scale
1K dev set → 5M production pairs
Capability
LLM-based relevance scoring at research scale
Reproducibility
Open-weight models, pinned versions
What I built
How I led
Stack
- Python
- AWS Bedrock
- AWS SDK
Key decisions
- Open-weight models over commercial APIs: commercial providers can retire or update hosted models on their own schedule, which would make published results irreproducible. Pinning an open-weight model version keeps the analysis reproducible for years.
- AWS Bedrock batch inference, which let us meet a tight data-processing timeline.
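To illustrate the batch-inference decision, here is a hedged sketch of how letter/posting pairs might be serialized into a JSONL input file for a batch job. It assumes one JSON object per line with `recordId` and `modelInput` fields, which is my understanding of the Bedrock batch input layout; the prompt template, the `prompt` key inside `modelInput`, and the record ID scheme are hypothetical, and the request body and any record ID constraints should be checked against the documentation for the chosen model.

```python
import json
from typing import Iterable, Iterator, Tuple

# Hypothetical prompt; the project's real prompt is not shown in the source.
PROMPT_TEMPLATE = (
    "Rate from 1 to 10 how well this cover letter addresses the job posting.\n"
    "Job posting:\n{posting}\n\nCover letter:\n{letter}\n\nScore:"
)


def build_batch_records(
    pairs: Iterable[Tuple[str, str, str]],  # (pair_id, letter, posting)
    runs: int = 3,
) -> Iterator[str]:
    """Yield one JSONL line per (pair, run) so each pair is scored `runs` times."""
    for pair_id, letter, posting in pairs:
        for run in range(runs):
            record = {
                # Assumed ID scheme; lets outputs be grouped back by pair.
                "recordId": f"{pair_id}-run{run}",
                # Assumed body shape; the real keys depend on the model.
                "modelInput": {
                    "prompt": PROMPT_TEMPLATE.format(letter=letter, posting=posting)
                },
            }
            yield json.dumps(record)


def write_batch_file(path: str, pairs: Iterable[Tuple[str, str, str]], runs: int = 3) -> int:
    """Write the records to a local JSONL file and return the line count."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for line in build_batch_records(pairs, runs):
            handle.write(line + "\n")
            count += 1
    return count
```

The resulting file would be uploaded to S3 and referenced when the batch job is submitted. Writing every repeat as its own record keeps the stability check a simple group-by on the pair ID once results come back.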