Back to projects

resumai

Python

An LLM inference pipeline that scores millions of cover letter–job posting pairs for quality and relevance. Traditional text analysis couldn’t judge whether a letter actually addressed a job; this pipeline runs each pair through an open-weight model multiple times, verifies score stability, and scales from a 1K dev set to 5M production pairs.

Scale

1K dev set → 5M production pairs

Capability

LLM-based relevance scoring at research scale

Reproducibility

Open-weight models, pinned versions

Batch inference pipelineStructured prompting and scoring logicValidation interfaceReproducibility and logging framework
Client relationship ownershipTimeline management
  • Python
  • AWS Bedrock
  • AWS SDK
  • Open-weight models over commercial APIs — commercial providers retire models without notice, which would make published results irreproducible. Pinning an open-weight model version keeps the analysis stable for years
  • AWS Bedrock batch inference allowed us to hit a tight data processing timeline