11 results for “topic:apache-spark-cluster”
A command-line tool for launching Apache Spark clusters.
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support.
This project contains customizations such as custom data sources and plugins written for distributed systems like Apache Spark and Apache Ignite.
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
Apache Spark standalone cluster with JupyterLab on Docker. Local development and multi-worker setup ready.
This project aims to do distributed analytics on clusters using a spatial dataset. Our goal was to analyze the impact of single-family residential zoning in the US and correlate it with quality-of-life measures, in an effort to discourage the segregation of zoning types and promote inclusivity.
This repository contains projects made for the Large Scale Data Analysis course at the AGH UST in 2024.
Implementations of the Markov Cluster Algorithm (MCL) and the Regularized Markov Cluster Algorithm (R-MCL) in Apache Spark.
Data engineering project: visualizes the number of visas issued by Japan, broken down by country and time.
A machine learning model built with PySpark that classifies whether a bank customer will churn, with an accuracy of more than 86% on the test set.
Apache Spark cluster lab.