GitHunt
kshi-glitch/BigDataOps_Lab

This project demonstrates real-world big data engineering practices using Apache Spark (PySpark). It covers the entire data pipeline, from ingestion, transformation, and validation through exploration and reporting. It is aimed at data engineers and analysts who want practical experience with Spark, Airflow, and data lake design.

No README found.

Languages

Python 48.0%
Shell 37.6%
HiveQL 13.6%
Scala 0.8%

Created May 28, 2025
Updated May 28, 2025