An open-source platform for reading, annotating, and linking texts. Originally built to support literary scholarship on The Tale of Genji, it gives scholars and students tools to connect translations in multiple languages to one another and to hundreds of years of material culture.
Projects
A modular, open-source platform that centralizes publicly available lung disease data from scattered repositories into a single searchable interface. Researchers can filter across gene expression, microbiome, and other pulmonary datasets and download everything in a consistent, analysis-ready format.
An LLM inference pipeline that scores millions of cover letter–job posting pairs for quality and relevance. Traditional text analysis couldn't judge whether a letter actually addressed a job; this pipeline runs each pair through an open-weight model multiple times, verifies score stability, and scales from a 1K dev set to 5M production pairs.
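The repeat-and-verify step can be sketched in a few lines of Python. This is an illustration only, not the pipeline's actual code: the function names, the three runs, the spread threshold of 1.0, and the integer score scale are all assumptions, and a stub stands in for the open-weight model call.

```python
import itertools
import statistics


def score_repeatedly(pair, scorer, runs=3, max_spread=1.0):
    """Score one (letter, posting) pair several times and check stability.

    `runs` and `max_spread` are hypothetical settings chosen for this sketch.
    """
    scores = [scorer(*pair) for _ in range(runs)]
    spread = max(scores) - min(scores)
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        "spread": spread,
        # A pair counts as stable when repeated runs agree closely.
        "stable": spread <= max_spread,
    }


def make_stub_scorer(outputs):
    """Return a stand-in for the model call that replays fixed scores."""
    replay = itertools.cycle(outputs)

    def scorer(letter, posting):
        return next(replay)

    return scorer


pair = ("Dear hiring manager, I have five years of SQL experience.",
        "Data analyst role, SQL required.")
steady = score_repeatedly(pair, make_stub_scorer([7, 7, 8]))
noisy = score_repeatedly(pair, make_stub_scorer([2, 9, 5]))
print(steady["stable"], noisy["stable"])
```

Pairs flagged as unstable could be rescored or set aside for review; which of those the real pipeline does is not stated here.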
A machine learning app that answers the only question that matters: does this recipe make pancakes? Trained a Random Forest classifier on 3,200 recipes scraped from the web, then wrapped it in a polished Shiny app. Born from a household mystery — a scrap of paper labeled 'probably pancakes.'
An R package for working with the Airtable API efficiently at scale. Existing packages sent one row per request — with a 5 req/sec rate limit and tables up to 50,000 records, that meant hours of waiting. rairtable batches up to 10 records per request with parallel JSON encoding, and returns tidy data frames that fit into normal Tidyverse workflows.
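The arithmetic behind that speedup is easy to check. The Python sketch below is not rairtable's implementation (which is in R); the helper names `chunk` and `min_seconds` are hypothetical, and the timing is a lower bound that counts only the rate limit and ignores network latency.

```python
import math


def chunk(records, size=10):
    """Split a list of records into batches of at most `size`."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def min_seconds(n_records, batch_size, req_per_sec=5):
    """Lower bound on upload time imposed by the rate limit alone."""
    requests = math.ceil(n_records / batch_size)
    return requests / req_per_sec


records = [{"id": i} for i in range(50_000)]
batches = chunk(records)

print(len(batches))                    # number of requests when batching by 10
print(min_seconds(50_000, 1) / 3600)   # hours at one row per request
print(min_seconds(50_000, 10) / 60)    # minutes at ten rows per request
```

At one row per request, 50,000 records need 50,000 requests, or at least 10,000 seconds (about 2.8 hours) at 5 requests per second. Batching ten rows per request cuts that to 5,000 requests and roughly 17 minutes.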