sumit-nimbale/CODSOFT
Machine Learning Internship Projects for CodSoft (December Batch). Includes movie genre classification, credit card fraud detection, and SMS spam detection using Python and scikit-learn.
CodSoft Machine Learning Internship Projects
-This repository contains three machine learning projects completed as part of the CodSoft Machine Learning Internship (December Batch).
-The projects focus on classical machine learning fundamentals, correct problem formulation, and appropriate evaluation metrics using Python and scikit-learn.
-The objective is learning and clarity of implementation rather than production deployment.
Repository Overview
-Three independent machine learning tasks
-Text and binary classification problems
-Emphasis on data preprocessing and evaluation
-Handling of imbalanced datasets
-Clean, reproducible project structure
Tasks Included
-Task 1: Movie Genre Classification
  -Text classification using movie descriptions
  -Feature extraction with TF-IDF
  -Multi-class model evaluation
  -Folder: Task 1 - Movie Genre Classification/
-Task 2: Credit Card Fraud Detection
  -Binary classification on highly imbalanced data
  -Focus on precision, recall, and false negatives
  -Evaluation beyond accuracy
  -Folder: Task_2_Credit_card_fraud_detection/
-Task 3: SMS Spam Detection
  -Binary text classification
  -NLP preprocessing and TF-IDF
  -Model comparison and selection
  -Folder: Task_3_SMS_Spam_Detection/
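The TF-IDF text classification used in Tasks 1 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code in the task folders: the messages and labels are made-up toy data, and Logistic Regression is just one of the models listed in this README. The same pipeline shape also works for a multi-class target such as movie genres.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up messages standing in for a real dataset (hypothetical data).
texts = [
    "win a free prize now",
    "free cash offer, claim now",
    "urgent: claim your free reward",
    "are we meeting for lunch today",
    "see you at the office tomorrow",
    "can you call me after the meeting",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TfidfVectorizer turns each text into a sparse vector of weighted term
# frequencies; the classifier is then fit on those vectors.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(),
)
model.fit(texts, labels)

# Classify an unseen message.
prediction = model.predict(["claim your free prize"])[0]
print(prediction)
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary learned from training data only, which avoids leaking test text into the features.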
Models Used
-Logistic Regression
-Naive Bayes
-Decision Trees
-Support Vector Machine
-Random Forest (where applicable)
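One way to compare the models listed above is cross-validation on a common dataset. The sketch below assumes a synthetic dataset from `make_classification` and default-ish hyperparameters, neither of which comes from this repository. `GaussianNB` is used because the synthetic features are continuous; for TF-IDF features, `MultinomialNB` would be the more usual Naive Bayes variant.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Mean F1 over 5 cross-validation folds for each candidate model.
scores = {
    name: cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    for name, clf in candidates.items()
}

# Print the models from best to worst mean F1.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")
```

Scoring every model on the same folds with the same metric keeps the comparison fair; the metric passed to `scoring` should match the problem (for example, recall-oriented metrics for fraud detection).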
Evaluation Metrics
-Confusion Matrix
-Precision
-Recall
-F1-Score
Metrics are chosen to fit each problem. On imbalanced datasets such as fraud detection, accuracy alone can be misleading, so precision, recall, and F1-score take priority.
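A small worked example shows why accuracy is not enough on imbalanced data. The labels and predictions below are hypothetical and hand-written for illustration; they are not results from any task in this repository.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical ground truth: 95 legitimate (0) and 5 fraudulent (1) transactions.
y_true = [0] * 95 + [1] * 5
# Hypothetical predictions: one false alarm, and only 2 of the 5 frauds caught.
y_pred = [0] * 94 + [1] + [1, 1, 0, 0, 0]

# For binary labels [0, 1], ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = (int(v) for v in confusion_matrix(y_true, y_pred).ravel())

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy={accuracy:.2f} precision={precision:.3f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Here accuracy is 0.96 even though three of the five frauds are missed: recall is 0.40, precision is about 0.667, and F1 is 0.50. The false negatives are visible only in the confusion matrix and the recall score.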
Repository Structure
CODSOFT/
├── Task 1 - Movie Genre Classification/
├── Task_2_Credit_card_fraud_detection/
├── Task_3_SMS_Spam_Detection/
└── README.md
Tools and Libraries
-Python
-NumPy
-Pandas
-scikit-learn
-Matplotlib
Internship Outcome
This internship provided hands-on experience in:
-End-to-end machine learning workflows
-NLP-based classification tasks
-Model evaluation and comparison
-Writing clean, well-documented machine learning code