Repos
9
Stars
3
Forks
0
Top Language
Python
Top Repositories
Processes NASA web server logs with PySpark on AWS EMR. A Tableau dashboard presents insights from the processed data.
A database management system that lets students apply for jobs posted by employers and events posted by university faculty. Developed in Java with SQL Server as the backend database.
Job-scraping pipeline deployed on AWS, with data stored in a Neo4j graph database, exposed through a Flask API, and served to end users through a basic UI.
Fraud detection using machine learning to determine whether a payment is legitimate or fraudulent. A Streamlit web app allows changing hyperparameters and comparing different models.
Tracks the location of a vehicle using Kafka streaming and displays the data in the browser using Leaflet.js.
Performed EDA and feature engineering on a flight dataset of 5.8 million records. Built PySpark pipelines for classification and regression models that predict flight delays and cancellations from various features.
Repositories
9
An Awesome List of Open-Source Data Engineering Projects
A curated list of data engineering tools for software developers
No description provided.