Web Data Collection & Modeling

Description

This project is a companion to the pancake classification app listed on the “Shiny” page of this site.

Goal

Collect recipes from the web to train a machine learning model that classifies each recipe as a pancake or not a pancake.

Outcome

A data set containing roughly 3,200 recipes.

Technology Used

  • Tidyverse (dplyr, purrr)
  • rvest
  • SMOTE (Synthetic Minority Oversampling Technique)
  • Random Forest
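
For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines. This is an illustrative, standard-library Python sketch and not the project's own R code: it assumes each recipe has already been turned into a numeric feature vector, and the function name `smote_sample`, its parameters, and the toy points are all hypothetical. Each synthetic example is placed at a random position on the line segment between a minority-class point and one of its nearest minority-class neighbors.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority-class neighbors.

    Hypothetical helper for illustration only; a real project would
    normally rely on an existing SMOTE implementation.
    """
    rng = random.Random(seed)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k closest minority points, excluding the base point itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy minority class with two numeric features per example (made-up values).
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (3.0, 3.0)]
new_points = smote_sample(minority, k=2, n_new=4)
```

The synthetic points are added to the training data only, so the classifier (here, a random forest) sees a more balanced mix of classes while the evaluation data stays untouched.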

Skills

  • Data cleaning and wrangling
  • Web scraping

View on GitHub