shervinnd/Forest-Cover-Type-Classifier

🌳 This Jupyter Notebook trains a neural network on the Covertype dataset to predict forest cover types using TensorFlow. It includes data preprocessing, model building, training, evaluation, random sample testing, and ROC curve visualization. Achieves ~85% accuracy on multi-class classification of 7 cover types! 🚀📊

🌟 Forest Cover Type Prediction with Neural Networks 🌲

Welcome to the Forest Cover Type Classifier repository! This project
uses a TensorFlow-based neural network to predict forest cover types
from the UCI Covertype dataset. 🌳🔍 Whether you're into machine
learning, environmental data, or just love forests, this notebook has
you covered! We preprocess the data, train a multi-layer perceptron,
evaluate performance, test random samples, and plot per-class ROC
curves. 🚀

📋 Project Overview

  • Dataset: UCI Forest Covertype (581,012 samples, 54 features, 7
    classes like Spruce/Fir, Lodgepole Pine, etc.) 🌿
  • Model: Sequential DNN with Dense layers, Dropout for
    regularization, and Softmax output. Trained with Adam optimizer and
    categorical cross-entropy. 🧠
  • Key Features:
    • Data splitting & standardization 📊
    • Batch training with TensorFlow Datasets ⚡
    • Accuracy evaluation (~85% on test set) ✅
    • Random sample prediction testing 🎲
    • Multi-class ROC curve visualization 📈
  • Tech Stack: TensorFlow, Scikit-learn, Matplotlib, NumPy 🛠️
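
As a rough illustration of the model described above, here is a minimal sketch of a Sequential MLP with Dense and Dropout layers, a Softmax output, the Adam optimizer, and categorical cross-entropy. The layer widths, dropout rate, and learning rate are placeholder assumptions and not the notebook's actual values. The helper names `build_model` and `make_dataset` are hypothetical, and the sketch assumes "TensorFlow Datasets" refers to the `tf.data` API.

```python
import numpy as np
import tensorflow as tf

NUM_FEATURES = 54  # feature count of the Covertype dataset
NUM_CLASSES = 7    # number of cover types


def build_model(hidden_units=(256, 128), dropout_rate=0.3, learning_rate=1e-3):
    """Build and compile an MLP. All default values here are assumptions."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(NUM_FEATURES,)))
    for units in hidden_units:
        model.add(tf.keras.layers.Dense(units, activation="relu"))
        model.add(tf.keras.layers.Dropout(dropout_rate))
    # Softmax turns the final layer into one probability per cover type.
    model.add(tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model


def make_dataset(X, y_onehot, batch_size=128, shuffle=False):
    """Wrap arrays in a batched tf.data pipeline."""
    ds = tf.data.Dataset.from_tensor_slices((X, y_onehot))
    if shuffle:
        ds = ds.shuffle(buffer_size=len(X))
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)


model = build_model()

# Tiny random stand-in data, used only to show the shapes involved.
X_demo = np.random.rand(256, NUM_FEATURES).astype("float32")
y_demo = np.eye(NUM_CLASSES, dtype="float32")[
    np.random.randint(0, NUM_CLASSES, size=256)
]
demo_ds = make_dataset(X_demo, y_demo, shuffle=True)
```

With real training and validation datasets built the same way, training would look like `model.fit(train_ds, validation_data=val_ds, epochs=20)`.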

🛠️ Installation

  1. Clone the repo:

    git clone https://github.com/shervinnd/forest-cover-type-classifier.git
    cd forest-cover-type-classifier
  2. Install dependencies (use a virtual environment like venv or conda):

    pip install tensorflow numpy matplotlib scikit-learn
  3. Open the Jupyter Notebook:

    jupyter notebook covtype.ipynb

    Note: This was tested on Python 3.12 with GPU acceleration (T4).
    For GPU training, make sure your TensorFlow install can see the
    GPU! ⚙️
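
One quick way to check whether TensorFlow can see a GPU is to list the physical devices it detects. This is a small sketch, not part of the notebook. An empty list means training will fall back to the CPU.

```python
import tensorflow as tf

# Ask TensorFlow which GPU devices it can see on this machine.
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    names = [gpu.name for gpu in gpus]
    print(f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s): {names}")
else:
    print(f"TensorFlow {tf.__version__} found no GPU; training will run on the CPU.")
```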

🚀 Usage

  1. Run the Notebook: Execute cells sequentially to:
    • Import libraries 📚
    • Load & preprocess data (fetch_covtype, scaling, one-hot
      encoding) 🔄
    • Build & compile the model 🏗️
    • Train for 20 epochs (batch size 128) ⏱️
    • Evaluate on test set 📉
    • Test a random sample 🎯
    • Generate ROC curves for each class 📊
  2. Customize:
    • Tweak hyperparameters like epochs, batch size, or layers in the
      parameters cell. 🔧
    • Re-run test_random_sample() to check predictions on different
      random test instances. 😄
  3. Output Example:
    • Training logs show accuracy climbing to ~81% on the training set
      and ~85% on validation. (Training accuracy can trail validation
      because Dropout is active only during training.)
    • ROC AUCs: High for most classes (e.g., 0.99+ for some)! 🌟
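
The load-and-preprocess step above can be sketched as follows. This is a minimal version under stated assumptions: the helper name `prepare_data`, the 80/20 split, the stratified sampling, and the seed are illustrative choices, not taken from the notebook. The demo uses synthetic data so it runs offline.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

NUM_CLASSES = 7  # number of cover types


def prepare_data(X, y, test_size=0.2, seed=42):
    """Split, standardize, and one-hot encode (hypothetical helper)."""
    # fetch_covtype labels run from 1 to 7; shift them to 0..6 for indexing.
    y_zero_based = np.asarray(y) - 1
    X_train, X_test, y_train, y_test = train_test_split(
        X, y_zero_based,
        test_size=test_size, random_state=seed, stratify=y_zero_based,
    )
    # Fit the scaler on the training split only, to avoid leaking
    # test-set statistics into training.
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)
    # One-hot encode the labels for categorical cross-entropy.
    eye = np.eye(NUM_CLASSES, dtype=np.float32)
    return X_train, X_test, eye[y_train], eye[y_test], scaler


# Synthetic stand-in with the same layout as Covertype (54 features, labels 1..7).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(700, 54))
y_demo = np.tile(np.arange(1, 8), 100)
X_train, X_test, y_train, y_test, scaler = prepare_data(X_demo, y_demo)

# With the real data (downloads the dataset on first use):
#   from sklearn.datasets import fetch_covtype
#   data = fetch_covtype()
#   X_train, X_test, y_train, y_test, scaler = prepare_data(data.data, data.target)
```

Keeping the fitted `scaler` around lets you apply the same standardization to any new sample before prediction.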

📊 Results & Insights

  • Test Accuracy: ~85.23% 🎉
  • Sample Prediction: Picks a random test instance, predicts its cover
    type (e.g., Lodgepole Pine), shows the class probabilities, and
    checks the prediction against the true label. ✅/❌
  • ROC Curves: One curve per class, plotted with Matplotlib. They show
    how well the model separates each cover type from the rest, which
    is useful on this imbalanced multi-class dataset. 📈
  • Pro Tip: Classes like Cottonwood/Willow might have lower AUC due to
    fewer samples. Experiment with oversampling! ⚖️
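
Per-class ROC curves for a multi-class model are commonly computed one-vs-rest: each class in turn is treated as the positive label and all others as negative. The sketch below assumes that approach, which may differ from the notebook's exact code. The helper name `per_class_roc` is hypothetical, and the demo scores are synthetic rather than real model output.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve


def per_class_roc(y_true_onehot, y_prob):
    """Return {class_index: (fpr, tpr, auc)} using one-vs-rest ROC curves."""
    curves = {}
    for k in range(y_true_onehot.shape[1]):
        # Column k of the one-hot labels is the binary "is class k" target;
        # column k of the predictions is the model's score for that class.
        fpr, tpr, _ = roc_curve(y_true_onehot[:, k], y_prob[:, k])
        curves[k] = (fpr, tpr, auc(fpr, tpr))
    return curves


# Synthetic demo: 7 classes, with scores nudged toward the true class.
rng = np.random.default_rng(0)
labels = rng.permutation(np.arange(700) % 7)  # every class is present
y_true = np.eye(7)[labels]
logits = rng.normal(size=(700, 7)) + 2.0 * y_true
y_prob = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

curves = per_class_roc(y_true, y_prob)
for k, (fpr, tpr, area) in curves.items():
    print(f"class {k}: AUC = {area:.3f}")
```

To plot, pass each `fpr`, `tpr` pair to Matplotlib's `plt.plot` and label the line with its AUC. With real predictions, `y_prob` would be the output of `model.predict` on the test features.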

🤝 Contributing

We'd love your input! 🌍

  • Fork the repo & create a pull request.
  • Suggestions: Add more models (e.g., CNNs, XGBoost), hyperparameter
    tuning with Keras Tuner, or deployment with Streamlit. 💡
  • Report issues or bugs via GitHub Issues. 🐛

📄 License

This project is licensed under the MIT License -- feel free to use,
modify, and share! 📜

Powered by Miracle ⚡ -- Exploring forests one prediction at a time!
🌲