dtungpka/shopee-scraper
shopee-scraper is a Python-based web scraping tool designed to extract product data and reviews from Shopee's e-commerce platform.
Shopee Scraper
This Python application scrapes product data from Shopee. It retrieves basic product information (name, link, price, etc.), and can also collect product reviews in different modes.
Installation Steps
- Clone the Repository:
  git clone https://github.com/dtungpka/shopee-scraper.git
  cd shopee-scraper
- Create a Virtual Environment (Optional But Recommended):
  - Linux/Mac:
    python3 -m venv venv
    source venv/bin/activate
  - Windows:
    python -m venv venv
    venv\Scripts\activate
- Install Dependencies:
  pip install -r requirements.txt
Usage
Before running:
- Make sure all Chrome windows are fully closed.
- Prepare to log in and solve captchas manually if prompted.
Basic Command
python src/retriv.py -k "your_search_term" -n 10 -r 30
When you see the search page loaded in the browser:
- Log in to Shopee (if needed).
- Solve any captcha presented.
- After continuing to the main search page, press Enter in the terminal to proceed.
- Keep an eye on the browser; if another captcha appears at any point, solve it to continue scraping.
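The pause-and-resume flow described above can be pictured with a small sketch. This is not the code in src/retriv.py: `wait_for_user`, `run_with_manual_checkpoint`, and the `input_fn` parameter are hypothetical names used only to illustrate the idea of blocking on Enter between opening the search page and scraping.

```python
from typing import Callable, List


def wait_for_user(prompt: str, input_fn: Callable[[str], str] = input) -> None:
    """Block until the operator presses Enter (hypothetical helper)."""
    input_fn(prompt)


def run_with_manual_checkpoint(
    open_search_page: Callable[[], None],
    scrape: Callable[[], List[str]],
    input_fn: Callable[[str], str] = input,
) -> List[str]:
    """Open the search page, wait for login/captcha handling, then scrape."""
    open_search_page()
    # The operator logs in and solves any captcha in the browser here.
    wait_for_user("Log in and solve any captcha, then press Enter to continue... ", input_fn)
    return scrape()
```

Passing `input_fn` as a parameter keeps the pause testable without a real terminal; by default it falls back to the built-in `input`.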
Scraping Modes
- Review Limit Mode:
  - Use -r or --review-limit to collect reviews from 5 stars downward until the limit is met:
    python src/retriv.py -k "laptop" -n 5 -r 10
    This collects up to 10 reviews per product, starting from the top ratings and moving downward.
- All-Star Types Mode:
  - Combine --all-star-types with --star-limit-per-type to specify how many reviews to retrieve for each star rating:
    python src/retriv.py -k "laptop" -n 5 --all-star-types --star-limit-per-type 5
    This collects up to 5 reviews for each star rating (5-star, 4-star, 3-star, and so on) in separate queries.
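To make the difference between the two modes concrete, here is a minimal sketch that applies each selection rule to an in-memory list of reviews. It is an illustration only: the real tool fetches reviews from Shopee, and the `Review` class and both function names are assumptions that do not come from the project's source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Review:
    rating: int  # 1 to 5 stars
    text: str


def select_review_limit(reviews: List[Review], limit: int) -> List[Review]:
    """Review Limit Mode: take reviews from 5 stars downward until `limit` is met."""
    # sorted() is stable, so reviews keep their original order within a rating.
    ordered = sorted(reviews, key=lambda r: r.rating, reverse=True)
    return ordered[:limit]


def select_all_star_types(reviews: List[Review], per_type: int) -> Dict[int, List[Review]]:
    """All-Star Types Mode: keep at most `per_type` reviews for each star rating."""
    selected: Dict[int, List[Review]] = {star: [] for star in range(5, 0, -1)}
    for review in reviews:
        bucket = selected.get(review.rating)
        if bucket is not None and len(bucket) < per_type:
            bucket.append(review)
    return selected


sample = [
    Review(5, "great"),
    Review(3, "ok"),
    Review(5, "love it"),
    Review(1, "broken"),
    Review(4, "good"),
]
top_two = select_review_limit(sample, 2)    # two highest-rated reviews
by_star = select_all_star_types(sample, 1)  # at most one review per rating
```

With a single overall limit, a product with many 5-star reviews may yield no low ratings at all, which is the case the per-rating mode is meant to cover.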
Command-Line Arguments
-k, --keyword: Search term (default: "Raspberry pi")
-n, --num: Number of products to retrieve (default: 10)
-r, --review-limit: Total reviews to collect per product (default: 30)
--index-only: If set, only retrieve index data without details
--all-star-types: Collect each star rating separately
--star-limit-per-type: Reviews per star type (default: 10)
--chrome-user-data-dir: Path to your Chrome profile directory
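For readers who want to see how such an interface is typically declared, the sketch below rebuilds the documented flags with Python's standard `argparse` module. It is reconstructed from the argument list above, not copied from src/retriv.py, so the function name `build_parser`, the help strings, the integer types, and the `None` default for --chrome-user-data-dir are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented command-line flags (illustrative reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Scrape product data and reviews from Shopee."
    )
    parser.add_argument("-k", "--keyword", default="Raspberry pi",
                        help="Search term")
    parser.add_argument("-n", "--num", type=int, default=10,
                        help="Number of products to retrieve")
    parser.add_argument("-r", "--review-limit", type=int, default=30,
                        help="Total reviews to collect per product")
    parser.add_argument("--index-only", action="store_true",
                        help="Only retrieve index data without details")
    parser.add_argument("--all-star-types", action="store_true",
                        help="Collect each star rating separately")
    parser.add_argument("--star-limit-per-type", type=int, default=10,
                        help="Reviews per star type")
    # The default for this flag is not documented; None is an assumption.
    parser.add_argument("--chrome-user-data-dir", default=None,
                        help="Path to your Chrome profile directory")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-k", "laptop", "-n", "5", "--all-star-types", "--star-limit-per-type", "3"]
)
```

`argparse` converts dashes in long option names to underscores, so the parsed values are read as `args.review_limit`, `args.all_star_types`, and so on.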
Example Command
python src/retriv.py -k "laptop" -n 5 --all-star-types --star-limit-per-type 3
License
This project is licensed under the MIT License. See the LICENSE file for details.