4 results for “topic:data-duplication”
A Python toolkit for analyzing machine learning models and datasets.
Final-year project on deleting duplicated data using machine learning, with source code and report.
Data quality analysis of the DermaMNIST (MedMNIST), HAM10000, and Fitzpatrick17k datasets.
A powerful machine-learning-based tool for detecting, analyzing, and removing duplicates in CSV datasets. Includes text similarity detection, numeric near-duplicate clustering, ML classification, visual analytics, and data cleaning. Provides both Streamlit and Flask apps, with ngrok support for easy deployment.