Hamagistral/NYCTaxi-Analytics-ETL
Performing Data Analytics on NYC Taxi data using GCP and MageAI
NYC Taxi Trip Records Data Analysis
Data Engineering Project Using GCP & MageAI
Goal
The goal of this project is to perform data analytics on NYC Taxi Trip Records using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and Looker Studio.
Dataset Used
Yellow trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. The data used in the attached datasets were collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP).
More information about the dataset can be found here:
- Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
Dashboard
Key Insights
- Total Trips
  - "VeriFone Inc" is the provider with the most trips, at over 88k, while "Creative Mobile Technologies" accounts for only 11k.
- Top Payment Types
  - No. 1: Credit card with 66%
  - No. 2: Cash with 33%
- Number of Passengers per Trip
  - 65% of the trips have only 1 passenger.
  - 13% have 2 passengers.
  - 8% have 5 passengers.
- Common Rate Code
  - The most common final rate code in effect at the end of the trip is "Standard rate" with over 97%, followed by "JFK" with 2.2% and then "Negotiated fare" and the remaining codes with less than 1%.
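Percentage shares like the ones above come down to a frequency count over a single column. The following is a minimal sketch of that calculation in plain Python, not the project's actual pipeline or dashboard code. The `category_shares` helper, the sample rows, and the `PAYMENT_TYPES` mapping are assumptions for illustration; the code-to-label mapping reflects our reading of the data dictionary linked above and should be checked against it.

```python
from collections import Counter

# Assumed label mapping for the `payment_type` column, based on our reading
# of the TLC yellow taxi data dictionary; verify against the linked PDF.
PAYMENT_TYPES = {1: "Credit card", 2: "Cash", 3: "No charge", 4: "Dispute"}


def category_shares(values, labels=None):
    """Return {label: percent of rows}, ordered from most to least common."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    labels = labels or {}
    return {
        labels.get(value, str(value)): round(100 * count / total, 1)
        for value, count in counts.most_common()
    }


# Hypothetical sample rows, not real TLC data.
sample_payment_types = [1, 1, 2, 1, 2, 1, 3, 1]
shares = category_shares(sample_payment_types, PAYMENT_TYPES)
for label, percent in shares.items():
    print(f"{label}: {percent}%")
```

The same helper could be applied to other categorical columns, such as the vendor, passenger count, or rate code, by passing a different label mapping.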



