Recommendation Engine (Movie Recommender System)

The Movie Recommendation System is a Streamlit-based web application that helps users discover movies they might enjoy.
It uses content-based filtering powered by machine learning to recommend movies similar to a user’s selection.

📌 Table of Contents

Overview
Project Workflow
Business Problem
Ingestion Script
Tools & Technologies
Project Structure
Data Pipeline Overview
App Preview
Key Outcomes
How to Run This Project
License

Overview

The recommender is built on content-based filtering: each movie's metadata is converted to a count vector with scikit-learn's `CountVectorizer`, and cosine similarity between those vectors scores how alike two movies are.

The model is deployed as a Streamlit web application for an interactive user experience.

✨ Key Features

Personalized movie recommendations
Movie posters for a visual preview
Ratings and overviews
Direct YouTube trailer links
Clean, interactive Streamlit interface

Project Workflow

1. The app loads the preprocessed movie metadata (movie_dict.pkl) and the similarity matrix (similarity.pkl).
2. The user selects a movie from the dropdown.
3. The app ranks all other movies by their precomputed similarity to the selection and keeps the top 5.
4. The app fetches posters, ratings, and overviews for those movies from the TMDb API.
5. The app displays the results in the Streamlit interface.
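
The ranking step amounts to a row lookup followed by a sort. Here is a minimal sketch, assuming the rows of the similarity matrix are in the same order as the movie titles. The `recommend` helper and the toy data are illustrative and not taken from app.py:

```python
from typing import List, Sequence


def recommend(title: str, titles: Sequence[str],
              similarity: Sequence[Sequence[float]], top_n: int = 5) -> List[str]:
    """Return the top_n titles most similar to `title`, excluding the title itself."""
    idx = list(titles).index(title)  # row of the selected movie
    # Pair every other movie's row index with its similarity to the selection.
    scored = [(i, s) for i, s in enumerate(similarity[idx]) if i != idx]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [titles[i] for i, _ in scored[:top_n]]


# Toy data: three movies and a made-up symmetric similarity matrix.
titles = ["Alien", "Aliens", "Notting Hill"]
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
print(recommend("Alien", titles, similarity, top_n=2))  # ['Aliens', 'Notting Hill']
```

Excluding the selected movie's own row index matters because every movie has similarity 1.0 with itself and would otherwise always rank first.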

Business Problem

With thousands of movies released every year, users face information overload and often struggle to pick what to watch next.
This project addresses that by suggesting movies similar to one the user already likes, which improves the user experience and content discovery.

Ingestion Script

Here’s a simple example to generate your own movie_dict.pkl and similarity.pkl files using a movie dataset:

import pickle

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load your dataset (expects 'overview', 'genres', and 'keywords' columns)
movies = pd.read_csv("movies.csv")

# Combine textual features into a single 'tags' column.
# Missing values become empty strings, and columns are joined with spaces
# so that words from adjacent columns do not fuse together.
text_cols = ['overview', 'genres', 'keywords']
movies['tags'] = movies[text_cols].fillna('').astype(str).agg(' '.join, axis=1)

# Convert text data to (sparse) feature vectors
cv = CountVectorizer(max_features=5000, stop_words='english')
vectors = cv.fit_transform(movies['tags'])

# Compute the dense movie-by-movie similarity matrix
similarity = cosine_similarity(vectors)

# Save the data, closing each file when done
with open('movie_dict.pkl', 'wb') as f:
    pickle.dump(movies.to_dict(), f)
with open('similarity.pkl', 'wb') as f:
    pickle.dump(similarity, f)
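
Reading the files back is the mirror image of the script above. The sketch below uses a small stand-in dictionary in place of a real dataset, shaped the way pandas' `DataFrame.to_dict()` shapes its default output (`{column: {row_index: value}}`). The column names and rows are hypothetical, and it assumes the DataFrame had a default 0..n-1 index so that row labels match similarity-matrix rows:

```python
import os
import pickle
import tempfile

# Stand-in for movies.to_dict(): {column: {row_index: value}} (hypothetical rows).
movie_dict = {
    "movie_id": {0: 11, 1: 12},
    "title": {0: "Movie A", 1: "Movie B"},
}

# Round-trip through a pickle file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "movie_dict.pkl")
    with open(path, "wb") as f:
        pickle.dump(movie_dict, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

# Titles in row order, e.g. to populate a dropdown.
titles = [loaded["title"][i] for i in sorted(loaded["title"])]
# Map a title back to its row index in the similarity matrix.
title_to_row = {title: row for row, title in loaded["title"].items()}
print(titles, title_to_row["Movie B"])  # ['Movie A', 'Movie B'] 1
```

With pandas available, `pd.DataFrame(loaded)` rebuilds the original table from the same dictionary.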

Tools & Technologies

| Tool | Purpose |
| --- | --- |
| Python | Core programming language |
| Streamlit | Web app development |
| TMDb API | Fetches posters, ratings, and trailers |
| Pandas | Data manipulation |
| Pickle | Data serialization |
| Scikit-learn | Similarity computation |

Project Structure

Movie-Recommender-System/
│
├── app.py                # Main Streamlit app
├── movie_dict.pkl        # Movie metadata file
├── similarity.pkl        # Precomputed similarity matrix
├── requirements.txt      # Python dependencies
└── README.md             # Project documentation

Data Pipeline Overview

| Step | Description |
| --- | --- |
| 1. Data Collection | Gather movie metadata such as titles, genres, keywords, and overviews. |
| 2. Data Preprocessing | Clean and merge textual columns (overview, genres, keywords, etc.) to create a single feature column. |
| 3. Feature Extraction | Convert combined text data into numerical vectors using `CountVectorizer`. |
| 4. Similarity Calculation | Compute cosine similarity between movie vectors to identify similar movies. |
| 5. Deployment | Integrate the model with Streamlit UI and TMDb API for real-time movie recommendations. |
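
To make steps 3 and 4 concrete, here is a dependency-free sketch of what a bag-of-words vector and cosine similarity compute. It is a simplification: the whitespace tokenizer below is an assumption and does not reproduce `CountVectorizer`'s tokenization or stop-word removal, and the movie tags are invented for illustration:

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase whitespace-separated tokens (a simplified tokenizer)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors."""
    # Only tokens present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


tags = {
    "Movie A": "space crew alien horror",
    "Movie B": "space marines alien action",
    "Movie C": "london bookshop romance comedy",
}
vectors = {title: bag_of_words(text) for title, text in tags.items()}
print(cosine(vectors["Movie A"], vectors["Movie B"]))  # 0.5 (two of four words shared)
print(cosine(vectors["Movie A"], vectors["Movie C"]))  # 0.0 (no words shared)
```

Because the score depends only on the angle between the vectors, a long overview and a short one can still score as highly similar if they use the same words in the same proportions.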

App Preview

Key Outcomes

| Outcome | Description |
| --- | --- |
| End-to-End Web App | Developed a complete movie recommendation system from data preprocessing to deployment. |
| TMDb API Integration | Integrated the TMDb API to fetch real-time movie posters, ratings, and trailers. |
| Cosine Similarity Model | Implemented content-based filtering using cosine similarity for accurate recommendations. |
| Interactive Streamlit UI | Designed a user-friendly interface with dynamic elements for enhanced user experience. |

How to Run This Project

Prerequisites⬇️

Install Python (version 3.7 or later).
Install the required Python libraries (scikit-learn is needed to generate the data files with the ingestion script):
```
pip install streamlit pandas requests scikit-learn
```

Setup⬇️

Prepare Data:
Since movie_dict.pkl and similarity.pkl are not provided, you need to generate them:
- The movie_dict.pkl file should contain movie metadata (e.g., movie IDs, titles, etc.).
- The similarity.pkl file should be a precomputed similarity matrix.
- The Ingestion Script section above shows one way to create both files from your own dataset.
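
Before launching the app, it can help to confirm that the two generated files agree with each other. The check below is a sketch under two assumptions: that the metadata dictionary has a `title` column, and that the similarity matrix has one row and one column per movie. The `check_artifacts` helper is hypothetical and not part of the project:

```python
from typing import Mapping, Sequence


def check_artifacts(movie_dict: Mapping, similarity: Sequence[Sequence[float]]) -> None:
    """Raise ValueError if the metadata and the similarity matrix do not line up."""
    if "title" not in movie_dict:
        raise ValueError("movie_dict has no 'title' column")
    n_movies = len(movie_dict["title"])
    if len(similarity) != n_movies:
        raise ValueError(
            f"similarity has {len(similarity)} rows, expected {n_movies}"
        )
    for row in similarity:
        if len(row) != n_movies:
            raise ValueError("similarity matrix is not square")


# Passes silently: two movies, 2x2 matrix.
check_artifacts({"title": {0: "Movie A", 1: "Movie B"}}, [[1.0, 0.5], [0.5, 1.0]])
```

The function relies only on `len`, so it accepts nested lists as well as a NumPy array loaded from similarity.pkl.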

Clone the Repository:

git clone https://github.com/your-username/Movie-Recommender-System.git
cd Movie-Recommender-System

Add the Required Files:
Place the generated movie_dict.pkl and similarity.pkl files in the project directory.

Run the Application:
```
streamlit run app.py
```

API Integration⬇️

The app uses the TMDb API to fetch movie details.
Set your TMDb API key in the code:

`api_key = "YOUR_TMDB_API_KEY"`
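
For orientation, a TMDb v3 movie-details request and the poster URL derived from its response can be sketched as follows. The helper names and the `TMDB_API_KEY` environment variable are assumptions made for illustration (reading the key from the environment is one way to keep it out of source control), and the app's actual request code may differ:

```python
import os
from typing import Optional
from urllib.parse import urlencode

TMDB_BASE = "https://api.themoviedb.org/3"
IMAGE_BASE = "https://image.tmdb.org/t/p/w500"  # w500 is one of TMDb's poster widths


def details_url(movie_id: int, api_key: str) -> str:
    """Build the movie-details endpoint URL for a TMDb movie ID."""
    return f"{TMDB_BASE}/movie/{movie_id}?{urlencode({'api_key': api_key})}"


def poster_url(details: dict) -> Optional[str]:
    """Turn the 'poster_path' field of a details response into a full image URL."""
    path = details.get("poster_path")
    return f"{IMAGE_BASE}{path}" if path else None


# Hypothetical: fall back to the placeholder if the variable is not set.
api_key = os.environ.get("TMDB_API_KEY", "YOUR_TMDB_API_KEY")
print(details_url(550, api_key))
print(poster_url({"poster_path": "/example.jpg"}))
```

The details response also carries the `vote_average` and `overview` fields, which correspond to the ratings and overviews shown in the app. `poster_url` returns `None` when a movie has no poster so the UI can show a placeholder instead.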

Contributions

Contributions are welcome! Feel free to fork this repository, make improvements, and submit pull requests.
Together, let's make this recommendation system even more powerful and versatile.

License

This project is licensed under the MIT License.
© 2025 Faisal Khan

If you like this project, don’t forget to 🌟 star and clone the repository.

Faisal-khann/Recommendation_Engine