Prane23/Azure-Databrick-Synapse-Analytics_End_to_End

The objective of this project is to design and implement a scalable, cloud-based data pipeline using Azure Databricks, Azure Data Lake, and Azure Synapse Analytics.

Azure Databricks & Synapse Analytics End-to-End Pipeline

This project demonstrates an end-to-end data pipeline using Azure Databricks, Azure Data Lake, and Azure Synapse Analytics. It shows how to ingest, transform, and analyze data with these cloud services.

🚀 Project Overview

The pipeline includes:

  • Data ingestion from raw sources into Azure Data Lake using Azure Data Factory (sample API data is included in the repository's data folder)
  • Data transformation using Azure Databricks (PySpark notebooks)
  • Data loading into Azure Synapse Analytics for reporting and analysis
  • Integration with Azure Data Factory for orchestration
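
The ingest and transform steps above can be sketched in Python roughly as follows. This is a minimal illustration and not the notebooks from this repository: the container names (`raw`, `curated`), the storage account name, the file paths, and the helper names `abfss_uri` and `transform_raw_to_curated` are all hypothetical. The transformation shown (dropping duplicate rows and writing Parquet) is a placeholder for whatever logic the real notebooks apply, and loading into Azure Synapse Analytics is left out.

```python
def abfss_uri(container: str, account: str, path: str) -> str:
    """Build an ADLS Gen2 URI of the form
    abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    # Strip leading slashes so the path is not joined with a double slash.
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def transform_raw_to_curated(spark, raw_uri: str, curated_uri: str) -> None:
    """Read raw JSON from the lake, de-duplicate it, and write Parquet.

    `spark` is the SparkSession that a Databricks notebook provides.
    The de-duplication step is an assumed, placeholder transformation.
    """
    raw_df = spark.read.json(raw_uri)
    cleaned_df = raw_df.dropDuplicates()
    cleaned_df.write.mode("overwrite").parquet(curated_uri)


# Hypothetical names, for illustration only.
ACCOUNT = "mystorageacct"
raw_uri = abfss_uri("raw", ACCOUNT, "api/sample.json")
curated_uri = abfss_uri("curated", ACCOUNT, "/sales/")
print(raw_uri)
print(curated_uri)
```

In a Databricks notebook, `transform_raw_to_curated(spark, raw_uri, curated_uri)` would be called with the session that the notebook already exposes as `spark`, assuming the cluster has been granted access to the storage account.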

🛠️ Technologies Used

  • Azure Databricks
  • Azure Synapse Analytics
  • Azure Data Lake Storage Gen2
  • Azure Data Factory
  • PySpark
  • SQL

Data ingestion

  • Data ingestion into Azure Data Lake using Azure Data Factory

🧩 Architecture Diagram

Architecture Diagram