
ussamaRehman/restaurant-health-predictor

Predict NYC restaurant inspection grades (A vs B/C+) with leak-safe, time-split ML baselines.

Restaurant Health Grade Predictor (NYC)

Predict whether a restaurant will receive Grade A vs Grade B/C (or worse) on its next NYC health inspection.

Dataset

  • Source: NYC OpenData (Socrata) restaurant inspections, dataset ID 43nn-pn8j.
  • We fetch the last 3 years via the API and cache a raw snapshot under data/raw/ for reproducibility.
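A minimal sketch of how the 3-year fetch could be expressed as a Socrata (SoQL) query URL. This is not the repo's actual fetch code: the domain data.cityofnewyork.us, the inspection_date column name, the page size, and both helper names are assumptions made for illustration. The block only builds the URL and makes no network call.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint: NYC OpenData's Socrata domain plus the dataset id from this README.
BASE_URL = "https://data.cityofnewyork.us/resource/43nn-pn8j.json"


def three_years_before(today: date) -> date:
    """Return the same calendar day three years earlier (Feb 29 falls back to Feb 28)."""
    try:
        return today.replace(year=today.year - 3)
    except ValueError:
        return today.replace(year=today.year - 3, day=28)


def build_query_url(today: date, limit: int = 50000, offset: int = 0) -> str:
    """Build a SoQL query URL for inspections in the three years before `today`."""
    since = three_years_before(today)
    params = {
        # `inspection_date` is an assumed column name.
        "$where": f"inspection_date >= '{since.isoformat()}T00:00:00'",
        "$order": "inspection_date",  # a stable order keeps paging consistent
        "$limit": limit,
        "$offset": offset,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_query_url(date(2025, 12, 30)))
```

Paging with $limit and $offset and saving each response unchanged under data/raw/ would keep the cached snapshot reproducible.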

Setup

This repo assumes a Python interpreter managed by pyenv; the version is pinned in .python-version.

make setup

Pipeline

One command:

make ml

Step-by-step:

make data
make preprocess
make train
make eval

Results

Latest run (default split: the test set is the most recent 20% of rows ordered by inspection_date_t1; threshold=0.5):

  • Logistic regression: precision_fail=0.249, recall_fail=0.951, f1_fail=0.394 (n_test=5671)
  • Persistence (last observed grade up to t): precision_fail=0.241, recall_fail=0.291, f1_fail=0.263
  • RandomForest: precision_fail=0.337, recall_fail=0.393, f1_fail=0.363
  • Run make eval to generate reports/confusion_matrix.csv locally
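The fail-class metrics above can be reproduced from labels and predictions. The sketch below is illustrative, not the repo's evaluation code: the function names are hypothetical, the toy data is invented, and the persistence rule (predict fail at t+1 whenever the grade at t is not A) is one plausible reading of "last observed grade up to t".

```python
def fail_metrics(y_true, y_pred):
    """Precision, recall and F1 for the positive class (1 = fail, i.e. B/C+)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def persistence_predict(grades_at_t):
    """Assumed persistence rule: predict fail at t+1 iff the grade at t is not A."""
    return [0 if grade == "A" else 1 for grade in grades_at_t]


# Toy data, invented for illustration only.
grades_at_t = ["A", "B", "A", "C", "A", "B"]
y_true = [0, 1, 1, 0, 0, 1]  # 1 = B/C+ at t+1

y_pred = persistence_predict(grades_at_t)
precision, recall, f1 = fail_metrics(y_true, y_pred)
print(f"precision_fail={precision:.3f}, recall_fail={recall:.3f}, f1_fail={f1:.3f}")
```

Treating "fail" as the positive class matters here: with an imbalanced target, accuracy on the majority class (A) would hide how many B/C+ outcomes are missed.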

Threshold tradeoff (logistic regression, FAIL class; approximate values, and the 0.5 row differs slightly from the latest-run figures above):

  • At threshold=0.5: precision_fail≈0.249, recall_fail≈0.947, f1_fail≈0.395
  • At threshold=0.7: precision_fail≈0.387, recall_fail≈0.520, f1_fail≈0.444

Raising the threshold improves precision (fewer false alarms) but reduces recall (more B/C+ outcomes missed).
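The tradeoff can be seen on a small example. This is a sketch on invented probabilities, not the repo's model output; the function name is hypothetical, and predicting fail when the probability is greater than or equal to the threshold is an assumption about the decision rule.

```python
def precision_recall(y_true, fail_proba, threshold):
    """Fail-class precision and recall when predicting fail at proba >= threshold."""
    y_pred = [1 if p >= threshold else 0 for p in fail_proba]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels (1 = B/C+) and predicted fail probabilities, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
fail_proba = [0.9, 0.8, 0.6, 0.75, 0.55, 0.3, 0.2, 0.4]

for threshold in (0.5, 0.7):
    precision, recall = precision_recall(y_true, fail_proba, threshold)
    print(f"threshold={threshold}: precision={precision:.3f}, recall={recall:.3f}")
# threshold=0.5: precision=0.600, recall=0.750
# threshold=0.7: precision=0.667, recall=0.500
```

At 0.7 the two borderline false alarms shrink to one, but one true fail that scored 0.6 is no longer flagged, which is the same pattern as in the figures above.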

Threshold tradeoff (RandomForest, FAIL class):

  • At threshold=0.5: precision_fail≈0.337, recall_fail≈0.393, f1_fail≈0.363
  • At threshold=0.7: precision_fail≈0.356, recall_fail≈0.113, f1_fail≈0.171

For this run, of the two thresholds tested (0.5 and 0.7), RandomForest achieves its best F1 at 0.5.

Run make eval to generate metrics under reports/. The focus is precision, recall, and F1 on the “fail” class (B/C+). Reports are generated locally and are not tracked in git.

Leakage constraints (summary)

  • Each row is an inspection event at time t.
  • Features may use information available up to and including t (including grade, score, and violations at t).
  • The target is the grade at t+1, the restaurant's next inspection.
  • Never include any t+1 fields in features. The train/test split is time-based on the t+1 inspection date.

See docs/leakage.md.
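A minimal sketch of these constraints, assuming each inspection is a dict with camis (restaurant id), inspection_date, grade, and score keys. Those key names, the two helper functions, and the toy records are assumptions for illustration, not the repo's preprocessing code.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter


def build_pairs(inspections):
    """Pair each inspection t with the same restaurant's next inspection t+1."""
    rows = []
    ordered = sorted(inspections, key=itemgetter("camis", "inspection_date"))
    for _, group in groupby(ordered, key=itemgetter("camis")):
        events = list(group)
        for current, nxt in zip(events, events[1:]):
            rows.append({
                "camis": current["camis"],
                # Features: only information known at t.
                "grade_t": current["grade"],
                "score_t": current["score"],
                # From t+1 we keep only the split key and the target, never features.
                "inspection_date_t1": nxt["inspection_date"],
                "target_fail_t1": int(nxt["grade"] != "A"),
            })
    return rows


def time_split(rows, test_fraction=0.2):
    """Hold out the most recent `test_fraction` of rows by t+1 date."""
    ordered = sorted(rows, key=itemgetter("inspection_date_t1"))
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Toy inspection history, invented for illustration only.
inspections = [
    {"camis": 1, "inspection_date": date(2023, 1, 10), "grade": "A", "score": 10},
    {"camis": 1, "inspection_date": date(2023, 7, 1), "grade": "B", "score": 20},
    {"camis": 1, "inspection_date": date(2024, 2, 1), "grade": "A", "score": 12},
    {"camis": 1, "inspection_date": date(2024, 9, 1), "grade": "A", "score": 9},
    {"camis": 2, "inspection_date": date(2023, 3, 1), "grade": "B", "score": 18},
    {"camis": 2, "inspection_date": date(2023, 10, 1), "grade": "C", "score": 30},
    {"camis": 2, "inspection_date": date(2024, 5, 1), "grade": "A", "score": 11},
]

pairs = build_pairs(inspections)
train, test = time_split(pairs)
print(len(pairs), len(train), len(test))
```

A restaurant's last inspection has no t+1 and therefore yields no row. Rows sharing the boundary date could land on both sides of the cut in this sketch; a real pipeline may prefer to split on a date rather than a row count.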

Limitations

  • Observational data; policies and inspection practices can change over time.
  • Restaurants with sparse history are harder to model.

Ethics note

Predictions should not be used to penalize businesses without due process; outputs are probabilistic and may encode historical biases in enforcement and reporting.

License: MIT.

Languages

Python 92.0%, Makefile 8.0%

Created December 30, 2025
Updated December 30, 2025