José María Luna
josemarialuna
PhD in Computer Science and AI at the University of Seville, Spain. #Python #Scala #Spark #DataScientist
Top Repositories
Clustering Validity Index based on Chi Square, as a Python package
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
📊 Python tool for creating datasets with clusters using a normal distribution. Customize clusters, significant columns, and add variability with dummy columns. Ideal for testing clustering algorithms.
This package contains the code for generating Big Data random datasets in Spark.
Repositories (18)
Docker-based Big Data environment with Hadoop, Hive, Trino, Kafka, and Airflow. Includes configurations, scripts, and MapReduce examples for distributed data analysis.
Clustering Validity Index based on Chi Square, as a Python package
This package contains the code for executing clustering validity indices in Spark. The package includes BD-Silhouette, BD-Dunn, Davies-Bouldin and WSSSE indices.
This package contains the code for calculating external clustering validity indices in Spark. The package includes Chi Index among others.
📊 Python tool for creating datasets with clusters using a normal distribution. Customize clusters, significant columns, and add variability with dummy columns. Ideal for testing clustering algorithms.
This repository contains the collection of UCI (real-life) datasets and Synthetic (artificial) datasets (with cluster labels and MATLAB files) ready to use with clustering algorithms.
Clustering
This package contains the code for generating Big Data random datasets in Spark.
No description provided.
Friedman tests for comparing multiple methods across datasets in Python
No description provided.
This package contains the code for executing clustering validity indices in Java by using K-means from Weka. The package includes the following clustering validity indices: Silhouette, Dunn, BD-Silhouette, BD-Dunn, Davies-Bouldin, Calinski-Harabasz, MaximumDiameter, SquaredDistance, AverageDistance, AverageBetweenClusterDistance, MinimumDistance.
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.