kshi-glitch/BigDataOps_Lab

This project demonstrates real-world big data engineering practices using Apache Spark (PySpark). It covers the entire data pipeline, from ingestion, transformation, and validation through exploration and reporting. It is aimed at data engineers and analysts who want practical experience with Spark, Airflow, and data lake design.

big-data data-engin data-engineering ecommerce hadoop hive hiveql nyse-stocks retail-data

No README found.

Languages

Python 48.0%, Shell 37.6%, HiveQL 13.6%, Scala 0.8%

Created May 28, 2025

Updated May 28, 2025