Predict car prices instantly with Linear & Lasso Regression! Built with Streamlit, scikit-learn, pandas & matplotlib. Compare models, explore data, and learn ML hands-on. Fast, open source, and easy to use for students & developers!
Linear-Lasso-Regression-Car-Price-Prediction
Overview
Linear-Lasso-Regression-Car-Price-Prediction is an interactive web application that predicts the price of a car using two machine learning models: Linear Regression and Lasso Regression. Built with Streamlit, it provides an intuitive interface where users enter car features and instantly get a price prediction from either model.
- Author: Nhan Pham
- Email: ptnhanit230104@gmail.com
- Version: 1.0.0
- Created: 2025-07-26
Visual Workflow
flowchart TD
A[User Inputs Car Features] --> B{Choose Model}
B -- Linear Regression --> C[Preprocess Features]
B -- Lasso Regression --> C
C --> D[Load Model]
D --> E[Predict Price]
E --> F[Display Result in UI]
Features
- User-friendly Web UI: Built with Streamlit for easy interaction.
- Dual Model Support: Predict prices using both Linear Regression and Lasso Regression.
- Instant Results: Get predictions in real-time as you input data.
- Educational: Explore and compare the effects of different regression techniques.
- Extensible: Modular codebase for easy extension to other regression models or datasets.
- Open Source: Freely available for learning, research, and extension.
- Well-documented: Includes guides for dataset, models, and Streamlit usage.
- Actively Maintained: Issues and PRs are welcome!
Table of Contents
- Overview
- Visual Workflow
- Features
- Demo
- How It Works
- Example Prediction
- Technical Stack
- Dataset
- Model Training
- Installation
- Usage
- Advanced Usage
- Project Structure
- Extending the Project
- Troubleshooting & FAQ
- Contributing
- Changelog
- Learning Resources
- Requirements
- License
- Acknowledgements
- Contact
How It Works
- User Input: Enter car features (year, present price, kms driven, fuel type, seller type, transmission, owner).
- Model Selection: Choose between Linear Regression or Lasso Regression.
- Preprocessing: The app encodes categorical features and prepares the input for the model.
- Prediction: The app loads the selected model (from .pkl files) and predicts the price.
- Result Display: The predicted price is shown instantly in the UI.
- Model Comparison: Users can easily compare predictions from both models to understand the impact of regularization.
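The preprocessing and prediction steps above can be sketched as follows. This is a minimal illustration, not the code in app/predictor.py: the function names and the integer codes for the categorical features are assumptions, and the real codes must match whatever encoding was used when the models were trained.

```python
import pickle
from pathlib import Path

# Hypothetical integer codes; the real mapping must match the training notebook.
FUEL_CODES = {"Petrol": 0, "Diesel": 1, "CNG": 2}
SELLER_CODES = {"Dealer": 0, "Individual": 1}
TRANSMISSION_CODES = {"Manual": 0, "Automatic": 1}


def encode_features(raw):
    """Turn the form values into one numeric row in a fixed column order."""
    return [
        raw["Year"],
        raw["Present_Price"],
        raw["Kms_Driven"],
        FUEL_CODES[raw["Fuel_Type"]],
        SELLER_CODES[raw["Seller_Type"]],
        TRANSMISSION_CODES[raw["Transmission"]],
        raw["Owner"],
    ]


def load_model(path):
    """Load a pickled model from disk. Only unpickle files you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def predict_price(model, raw):
    """Encode one input and return the model's prediction as a float."""
    row = encode_features(raw)
    # scikit-learn estimators expect a 2D input: one inner list per sample.
    return float(model.predict([row])[0])
```

With the model files in place, a call such as predict_price(load_model("model/lasso_model.pkl"), raw) would return the Lasso prediction for the dictionary raw.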
Example Prediction
Suppose you want to predict the price of a car with the following features:
| Feature | Value |
|---|---|
| Year | 2018 |
| Present Price | 500000 |
| Kms Driven | 30000 |
| Fuel Type | Petrol |
| Seller Type | Individual |
| Transmission | Manual |
| Owner | 1 |
- Step 1: Enter these values in the app fields.
- Step 2: Click With Linear model or With Lasso model.
- Step 3: The app will display something like:
Predicted Price: $420,000.00
Try changing the model or input values to see how the prediction changes!
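The two models can disagree because Lasso adds an L1 penalty that shrinks coefficients toward zero. The sketch below shows this for the simplest case, a single centered feature with no intercept, using the objective (1/(2n)) * sum((y - w*x)^2) + alpha * |w| that scikit-learn documents for Lasso. The helper names are hypothetical, and the toy numbers are not taken from the car dataset.

```python
def soft_threshold(z, alpha):
    """Shrink z toward zero by alpha, clipping to exactly zero inside [-alpha, alpha]."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def fit_one_feature(x, y, alpha=0.0):
    """Closed-form minimizer of (1/(2n)) * sum((y - w*x)^2) + alpha * |w|."""
    n = len(x)
    rho = sum(xi * yi for xi, yi in zip(x, y)) / n   # feature/target correlation term
    scale = sum(xi * xi for xi in x) / n             # mean squared feature value
    return soft_threshold(rho, alpha) / scale


# Toy data where the true slope is 2.
x = [-1.0, 0.0, 1.0]
y = [-2.0, 0.0, 2.0]

ols_w = fit_one_feature(x, y)               # alpha = 0 reduces to ordinary least squares
lasso_w = fit_one_feature(x, y, alpha=0.5)  # moderate penalty shrinks the slope
heavy_w = fit_one_feature(x, y, alpha=2.0)  # strong penalty removes the feature entirely
print(ols_w, lasso_w, heavy_w)
```

With alpha set to 0 the result is the least-squares slope of 2.0; a penalty of 0.5 shrinks it to about 1.25, and a penalty of 2.0 sets it to 0.0, which is why Lasso predictions can sit closer to the average price than Linear Regression predictions.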
Technical Stack
- Frontend/UI: Streamlit (Python-based web app framework)
- Backend/ML: scikit-learn (for model training and inference)
- Data Handling: pandas
- Visualization: matplotlib (for EDA and model training)
- Serialization: pickle (for saving/loading models)
- Jupyter Notebook: For model development and experimentation
Dataset
- Source: See dataset/car_data.csv
- Description: Contains 301 rows and 9 columns, including features like Year, Present_Price, Kms_Driven, Fuel_Type, Seller_Type, Transmission, Owner, and the target Selling_Price.
- Details:
  - No missing values
  - Categorical features are encoded for model compatibility
  - For more, see docs/dataset.md
- Sample Columns:
  - Year: Year of manufacture
  - Present_Price: Current ex-showroom price
  - Kms_Driven: Kilometers driven
  - Fuel_Type: Petrol/Diesel/CNG
  - Seller_Type: Dealer/Individual
  - Transmission: Manual/Automatic
  - Owner: Number of previous owners
  - Selling_Price: Price at which the car was sold (target)
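A quick way to confirm that a CSV matches the description above is to check its header and scan for blank cells. The sketch below uses the standard library's csv module rather than pandas so that it runs without extra dependencies; check_dataset and REQUIRED_COLUMNS are hypothetical names, and only the eight columns listed above are checked.

```python
import csv

# The columns named in this README; the file may contain others as well.
REQUIRED_COLUMNS = [
    "Year", "Present_Price", "Kms_Driven", "Fuel_Type",
    "Seller_Type", "Transmission", "Owner", "Selling_Price",
]


def check_dataset(path):
    """Return (row_count, missing_columns, irregular_row_count) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        rows = 0
        irregular = 0
        for row in reader:
            rows += 1
            # Short rows yield None values, long rows yield a list under the None key,
            # and empty cells yield blank strings; count all of these as irregular.
            if any(not isinstance(v, str) or not v.strip() for v in row.values()):
                irregular += 1
    return rows, missing, irregular
```

If the description above holds, check_dataset("dataset/car_data.csv") should report 301 rows, no missing columns, and no irregular rows.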
Model Training
- Notebook: model/model-training.ipynb
- Models:
  - linear_model.pkl: Trained Linear Regression model
  - lasso_model.pkl: Trained Lasso Regression model
- Workflow:
  - Data loaded and cleaned
  - Categorical features encoded
  - Data split into train/test sets
  - Models trained and evaluated
  - Best models saved as .pkl files
- Learn more: see docs/linear-regression-model.md and docs/lasso-regression-model.md
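The training workflow can be sketched end to end as below. This is an illustration under stated assumptions rather than the contents of model/model-training.ipynb: it uses synthetic data in place of car_data.csv so that it runs anywhere, the alpha value is an arbitrary choice, and the files are written to a temporary directory instead of model/.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the car dataset): the third feature is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),  # alpha is an illustrative choice
}

scores = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Save the fitted model, then reload it the way the app would.
        path = Path(tmp) / f"{name}_model.pkl"
        with path.open("wb") as fh:
            pickle.dump(model, fh)
        with path.open("rb") as fh:
            restored = pickle.load(fh)
        scores[name] = r2_score(y_test, restored.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
print("lasso coefficients:", models["lasso"].coef_)
```

On data like this, both models score well, and the Lasso coefficient for the irrelevant third feature is driven to (or very near) zero, which is the behavior the comparison in the app is meant to highlight.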
Installation
1. Clone the Repository
git clone https://github.com/NhanPhamThanh-IT/Linear-Lasso-Regression-Car-Price-Prediction.git
cd Linear-Lasso-Regression-Car-Price-Prediction
2. Install Dependencies
Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install required packages:
pip install streamlit scikit-learn pandas matplotlib
3. (Optional) Retrain Models
If you want to retrain the models, open and run model/model-training.ipynb in Jupyter Notebook or VSCode.
Usage
1. Run the App
streamlit run app/main.py
The app will open in your browser at http://localhost:8501.
2. Using the App
- Fill in the car features in the input fields
- Click With Linear model or With Lasso model to get a prediction
- The predicted price will be displayed instantly
- Try different inputs and compare model results
Advanced Usage
- Custom Port: Run on a different port:
streamlit run app/main.py --server.port 8502
- Headless Mode: For deployment:
streamlit run app/main.py --server.headless true
- Docker: See docs/streamlit.md for Docker deployment instructions.
- Streamlit Cloud: Deploy directly from GitHub for free.
- API Integration: (Advanced) Wrap prediction logic in a FastAPI or Flask API for programmatic access.
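The API integration idea can be sketched without extra dependencies by using the standard library's WSGI interface. This is a stand-in for FastAPI or Flask, not either framework's API, and make_app, the /predict route, and the JSON request shape are all hypothetical choices made for illustration.

```python
import json


def make_app(predict_fn):
    """Build a WSGI app that answers POST /predict with a JSON prediction.

    predict_fn takes a list of already-encoded feature values and returns a number.
    """

    def respond(start_response, status, payload):
        body = json.dumps(payload).encode("utf-8")
        headers = [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]

    def app(environ, start_response):
        is_predict = (
            environ.get("REQUEST_METHOD") == "POST"
            and environ.get("PATH_INFO") == "/predict"
        )
        if not is_predict:
            return respond(start_response, "404 Not Found", {"error": "not found"})
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            payload = json.loads(environ["wsgi.input"].read(length))
            price = float(predict_fn(payload["features"]))
        except (ValueError, KeyError, TypeError):
            return respond(start_response, "400 Bad Request", {"error": "invalid request"})
        return respond(start_response, "200 OK", {"predicted_price": price})

    return app


# To serve it locally (blocks until interrupted):
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, make_app(my_predict_fn)).serve_forever()
```

Keeping the prediction function as a parameter means the same app can wrap either the Linear or the Lasso model, and it can be exercised in tests without starting a server.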
Project Structure
Linear-Lasso-Regression-Car-Price-Prediction/
├── app/
│   ├── main.py # Streamlit app entry point
│   ├── predictor.py # Model loading and prediction logic
│   └── ui.py # Streamlit UI components
├── dataset/
│   └── car_data.csv # Car price dataset
├── docs/
│   ├── dataset.md # Dataset documentation
│   ├── lasso-regression-model.md # Lasso regression theory & practice
│   ├── linear-regression-model.md # Linear regression theory & practice
│   └── streamlit.md # Streamlit learning guide
├── model/
│   ├── lasso_model.pkl # Trained Lasso model
│   ├── linear_model.pkl # Trained Linear model
│   └── model-training.ipynb # Model training notebook
├── LICENSE
└── README.md
Extending the Project
Want to add new features or support more models? Here are some ideas:
- Add More Regression Models: Integrate Ridge, ElasticNet, or custom models.
- Feature Engineering: Add new input features or transformations.
- Visualization: Show feature importance, residuals, or model diagnostics in the UI.
- API Integration: Expose predictions via a REST API for integration with other apps.
- Deployment: Deploy to Streamlit Cloud, Heroku, or Docker (see docs/streamlit.md).
- UI Enhancements: Add charts, explanations, or user authentication.
How to add a new model:
- Train and save your model as a .pkl file.
- Update app/predictor.py to load and use the new model.
- Add a button or option in app/ui.py for users to select the new model.
- Update the README and docs as needed.
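One way to keep these steps small is a single lookup table from model name to file, so that adding a model means adding one entry. The sketch below is an assumption about how such a table could look, not the current contents of app/predictor.py; MODEL_FILES, model_path, and the ridge entry are hypothetical.

```python
from pathlib import Path

MODEL_DIR = Path("model")

# Existing model files, plus one hypothetical new entry.
MODEL_FILES = {
    "linear": "linear_model.pkl",
    "lasso": "lasso_model.pkl",
    "ridge": "ridge_model.pkl",  # hypothetical: a newly trained Ridge model
}


def model_path(name):
    """Return the path of the pickled model registered under name."""
    try:
        return MODEL_DIR / MODEL_FILES[name]
    except KeyError:
        choices = ", ".join(sorted(MODEL_FILES))
        raise ValueError(f"Unknown model {name!r}; choose from: {choices}") from None
```

The UI can then build its model options from the keys of MODEL_FILES instead of hard-coding one button per model.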
Troubleshooting & FAQ
Q: The app doesn't start or crashes on launch.
- Make sure all dependencies are installed (pip install streamlit scikit-learn pandas matplotlib).
- Check your Python version (should be 3.7+).
- Ensure you are running the command from the project root directory.
Q: I get a ModuleNotFoundError for 'streamlit' or 'sklearn'.
- Activate your virtual environment if you created one.
- Run pip install -r requirements.txt if you have a requirements file.
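The checks from the two answers above can be automated with a short diagnostic script. It is a sketch using only the standard library; missing_modules and python_ok are hypothetical helper names. Note that scikit-learn is imported under the module name sklearn.

```python
import importlib.util
import sys

# Import names of the packages this project depends on.
REQUIRED_MODULES = ["streamlit", "sklearn", "pandas", "matplotlib"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 7)):
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


print("Python version OK:", python_ok())
print("Missing modules:", missing_modules() or "none")
```

Run it with the same interpreter (and virtual environment) that you use for streamlit run, since a module installed in one environment is invisible to another.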
Q: The prediction is always the same or seems off.
- Check that the input features are reasonable and within the expected range.
- Make sure the model .pkl files are present in the model/ directory.
- Retrain the models if needed using the provided notebook.
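Both checks can be scripted. The sketch below validates one input against the category values listed in the Dataset section and reports any missing model files; input_problems and missing_model_files are hypothetical helpers, and the non-negative rule for numeric fields is an assumption rather than a documented constraint.

```python
from pathlib import Path

# Category values as listed in the Dataset section.
ALLOWED = {
    "Fuel_Type": {"Petrol", "Diesel", "CNG"},
    "Seller_Type": {"Dealer", "Individual"},
    "Transmission": {"Manual", "Automatic"},
}


def input_problems(raw):
    """Return a list of human-readable problems found in one input dictionary."""
    problems = []
    for field, allowed in ALLOWED.items():
        if raw.get(field) not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    # Assumption: these numeric inputs should never be negative.
    for field in ("Present_Price", "Kms_Driven", "Owner"):
        value = raw.get(field)
        if not isinstance(value, (int, float)) or value < 0:
            problems.append(f"{field} must be a non-negative number")
    return problems


def missing_model_files(model_dir="model"):
    """Return the expected model files that are absent from model_dir."""
    expected = ["linear_model.pkl", "lasso_model.pkl"]
    return [name for name in expected if not (Path(model_dir) / name).is_file()]
```

An empty list from both functions rules out the two most common causes; if predictions still look wrong, retraining is the next step.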
Q: How do I deploy this app online?
- See deployment instructions in docs/streamlit.md for Streamlit Cloud, Heroku, and Docker.
Q: Can I use my own dataset?
- Yes! Replace dataset/car_data.csv with your own data (matching the expected columns), retrain the models, and update the code as needed.
Contributing
Contributions are welcome! To contribute:
- Fork the repository
- Create a new branch (git checkout -b feature/your-feature)
- Make your changes and commit them
- Push to your fork (git push origin feature/your-feature)
- Open a Pull Request describing your changes
Guidelines:
- Write clear, descriptive commit messages
- Document new features or changes in the README/docs
- Follow PEP8 style for Python code
- Add tests or example usage if possible
- Please read our Code of Conduct before contributing.
Changelog
See CHANGELOG.md for a list of major changes, new features, and bug fixes.
Learning Resources
- Streamlit Guide
- Linear Regression Guide
- Lasso Regression Guide
- Dataset Guide
- scikit-learn Documentation
- Streamlit Documentation
- Pandas Documentation
Requirements
- Python 3.7+
- streamlit
- scikit-learn
- pandas
- matplotlib
You can also create a requirements.txt file for easy installation:
streamlit
scikit-learn
pandas
matplotlib
License
This project is licensed under the MIT License.
Acknowledgements
- Streamlit
- scikit-learn
- pandas
- matplotlib
- UCI Machine Learning Repository (for dataset inspiration)
Contact
For questions, suggestions, or contributions, please contact:
- Nhan Pham
- Email: ptnhanit230104@gmail.com