Exploratory data analysis of NYC yellow-taxi trips (2017) to profile trips, identify fare and tip drivers, and surface operational levers for pricing and cashless-payment strategy.
Taxi_Trip_Data.csv: 22,699 rows × 18 columns (NYC TPEP schema), period 2017-01-01 → 2017-12-31.

Key finding 1: Trip profile & fare drivers
total_amount ≈ $11.80 (p90 ≈ $30.35).
Correlation with total_amount: fare_amount 0.984, trip_distance 0.902, dur_min 0.162, passenger_count 0.011 → distance/fare dominate; duration/passengers contribute little.
Median fare_per_mile ≈ $5.51 (robust to outliers).

Key finding 2: Payment & tipping behavior
Business impact
Adjust relative paths to your repo layout.
Taxi-Trip-Analysis.ipynb
Taxi_Trip_Data.csv
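As a minimal sketch of loading the dataset, the snippet below reads the CSV with pandas and parses the timestamp columns. The path in `DATA_PATH`, the helper name `load_trips`, and the column names `tpep_pickup_datetime` / `tpep_dropoff_datetime` are assumptions based on the standard TPEP schema, not something this README confirms; adjust them to match the notebook.

```python
from pathlib import Path

import pandas as pd

# Hypothetical location; adjust to your repo layout.
DATA_PATH = Path("Taxi_Trip_Data.csv")

# Assumed TPEP timestamp column names; not confirmed by this README.
DATETIME_COLS = ["tpep_pickup_datetime", "tpep_dropoff_datetime"]


def load_trips(path=DATA_PATH):
    """Read the trip CSV and parse timestamp columns when they are present."""
    df = pd.read_csv(path)
    for col in DATETIME_COLS:
        if col in df.columns:
            # Unparseable timestamps become NaT instead of raising.
            df[col] = pd.to_datetime(df[col], errors="coerce")
    return df


if __name__ == "__main__":
    if DATA_PATH.exists():
        trips = load_trips()
        # Compare against the 22,699 rows × 18 columns stated in this README.
        print(trips.shape)
    else:
        print(f"{DATA_PATH} not found; adjust DATA_PATH")
```

Checking the shape right after loading is a cheap way to confirm that the file in your checkout is the one described here.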
Correlation with total_amount: fare 0.984, distance 0.902, duration 0.162, passengers 0.011.
fare_per_mile median $5.51.

Data wrangling (Pandas), datetime feature engineering, outlier handling, correlation analysis, and decision-ready business.
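The engineered features and correlations reported above could be reproduced along the following lines. This is a sketch under stated assumptions, not the notebook's actual code: the helper names `add_features` and `total_amount_correlations` are hypothetical, the timestamp columns are assumed to follow the TPEP naming, `fare_per_mile` is assumed to be `fare_amount / trip_distance`, and the correlations are assumed to be Pearson. The demo rows are made up for illustration and are not taken from the dataset.

```python
import pandas as pd


def add_features(df):
    """Add dur_min and fare_per_mile (both definitions are assumptions)."""
    out = df.copy()
    pickup = pd.to_datetime(out["tpep_pickup_datetime"])
    dropoff = pd.to_datetime(out["tpep_dropoff_datetime"])
    # Trip duration in minutes.
    out["dur_min"] = (dropoff - pickup).dt.total_seconds() / 60
    # Mask non-positive distances so the ratio is NaN rather than inf.
    distance = out["trip_distance"].where(out["trip_distance"] > 0)
    out["fare_per_mile"] = out["fare_amount"] / distance
    return out


def total_amount_correlations(
    df, cols=("fare_amount", "trip_distance", "dur_min", "passenger_count")
):
    """Pearson correlation of each listed column with total_amount."""
    return df[list(cols)].corrwith(df["total_amount"])


# Illustrative rows only; these values are not from Taxi_Trip_Data.csv.
demo = pd.DataFrame(
    {
        "tpep_pickup_datetime": [
            "2017-03-01 08:00:00",
            "2017-03-01 09:00:00",
            "2017-03-01 10:00:00",
            "2017-03-01 11:00:00",
        ],
        "tpep_dropoff_datetime": [
            "2017-03-01 08:10:00",
            "2017-03-01 09:25:00",
            "2017-03-01 10:05:00",
            "2017-03-01 11:40:00",
        ],
        "trip_distance": [1.0, 5.0, 0.0, 10.0],
        "fare_amount": [6.0, 20.0, 3.0, 35.0],
        "total_amount": [7.5, 24.0, 3.8, 42.0],
        "passenger_count": [1, 2, 1, 1],
    }
)

feat = add_features(demo)
# The median skips the NaN produced by the zero-distance trip.
print(feat["fare_per_mile"].median())
print(total_amount_correlations(feat))
```

Using the median for `fare_per_mile` keeps a handful of very short, high-fare trips from dominating the summary, which is consistent with the "robust to outliers" note in the findings.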