Gumtree Property Scraper Auto Filters Duplicates
A high-performance Gumtree property scraper that collects rental and property listings with automatic duplicate filtering. It helps investors, sourcers, and analysts monitor prices, compare listings, and identify opportunities faster using structured, up-to-date data.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
Introduction
This project extracts detailed property listings from Gumtree based on search URLs and applied filters. It eliminates repetitive manual browsing and keeps your dataset clean by automatically handling duplicates across runs.
It is designed for property investors, rent-to-rent sourcers, and data-driven real estate professionals.
Built for Property Data Intelligence
- Collects structured property data from Gumtree search results
- Automatically filters out duplicate listings across runs
- Normalizes weekly and monthly pricing for easy comparison
- Supports continuous monitoring of new listings over time
Features
| Feature | Description |
|---|---|
| Unlimited Runs | Collect property listings without artificial run limits. |
| Automatic Deduplication | Skips previously seen listings using IDs and URLs. |
| Price Normalization | Converts weekly prices to monthly for consistency. |
| Rich Property Fields | Captures pricing, location, seller, and descriptions. |
| Multi-URL Support | Run multiple saved search URLs in a single execution. |
| Timestamped Records | Every listing includes a scrape timestamp. |
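As a rough illustration of the Multi-URL Support and Timestamped Records rows, the sketch below reads search URLs from a block of text and stamps a record with `scraped_at`. The one-URL-per-line input format, the `#` comment convention, the use of UTC, and the helper names `parse_search_urls` and `stamp` are all assumptions made for this example; they are not taken from the project code.

```python
from datetime import datetime, timezone
from typing import List, Optional


def parse_search_urls(text: str) -> List[str]:
    """Return unique, non-empty lines in their original order."""
    urls: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        if line not in urls:
            urls.append(line)
    return urls


def stamp(record: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the record with a 'scraped_at' field added."""
    # Assumption: timestamps are UTC; the README does not state a timezone.
    now = now or datetime.now(timezone.utc)
    return {**record, "scraped_at": now.strftime("%Y-%m-%d %H:%M:%S")}
```

The timestamp format mirrors the `scraped_at` value shown in the example output.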
What Data This Scraper Extracts
| Field Name | Field Description |
|---|---|
| title | Property listing title. |
| price | Displayed price with currency symbol. |
| price_value | Numeric price value only. |
| price_frequency | Price frequency, such as pm (per month) or pw (per week). |
| location | Area or city where the property is located. |
| bedrooms | Number of bedrooms listed. |
| property_type | Type of property (flat, house, studio, etc.). |
| date_available | Listed move-in availability date. |
| seller_type | Indicates private or commercial seller. |
| seller_name | Name of the seller or landlord. |
| phone_number | Contact number if publicly visible. |
| description | Full property description text. |
| images | Array of image URLs from the listing. |
| listing_id | Unique identifier for the listing. |
| url | Direct URL to the property listing. |
| date_posted | Date when the listing was published. |
| scraped_at | Timestamp when the data was collected. |
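For readers who want to load these records into typed code, the fields above could be modelled roughly as follows. This is a sketch only: the class name `Listing`, the Python types, and the choice of which fields are required are assumptions inferred from the table and the example output, not the project's own schema.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Listing:
    # Assumption: these three fields are always present.
    listing_id: str
    url: str
    title: str
    # Everything else is treated as optional, since listings may omit it.
    price: Optional[str] = None
    price_value: Optional[float] = None
    price_frequency: Optional[str] = None  # "pm" or "pw"
    location: Optional[str] = None
    bedrooms: Optional[int] = None
    property_type: Optional[str] = None
    date_available: Optional[str] = None
    seller_type: Optional[str] = None
    seller_name: Optional[str] = None
    phone_number: Optional[str] = None
    description: Optional[str] = None
    images: List[str] = field(default_factory=list)
    date_posted: Optional[str] = None
    scraped_at: Optional[str] = None


example = Listing(
    listing_id="1499893351",
    url="https://www.gumtree.com/p/property-to-rent/example",
    title="Self Contained Studios to Rent in Tooting Broadway",
    price_value=1550,
    price_frequency="pm",
)
record = asdict(example)  # plain dict, ready for JSON serialization
```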
Example Output
[
  {
    "title": "Self Contained Studios to Rent in Tooting Broadway",
    "price": "£1550",
    "price_value": 1550,
    "price_frequency": "pm",
    "location": "Tooting Broadway, London",
    "bedrooms": 1,
    "property_type": "Flat",
    "date_available": "26 Jun 2025",
    "seller_type": "Private",
    "seller_name": "Frank",
    "listing_id": "1499893351",
    "url": "https://www.gumtree.com/p/property-to-rent/example",
    "description": "Prime Tooting location, close to transport and amenities.",
    "images": [
      "https://imagedelivery.net/example1.jpg",
      "https://imagedelivery.net/example2.jpg"
    ],
    "scraped_at": "2025-07-06 02:01:05"
  }
]
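The date fields in the example output are plain strings, so downstream code usually needs to parse them. The sketch below assumes the two formats seen in the example (`26 Jun 2025` and `2025-07-06 02:01:05`) are used consistently, which the README does not guarantee; the function names are hypothetical.

```python
from datetime import date, datetime


def parse_date_available(value: str) -> date:
    # Assumption: day, abbreviated English month name, four-digit year.
    return datetime.strptime(value, "%d %b %Y").date()


def parse_scraped_at(value: str) -> datetime:
    # Assumption: 'YYYY-MM-DD HH:MM:SS' with no timezone information.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


def days_until_available(record: dict) -> int:
    """Days from the scrape date to the move-in date (negative if already past)."""
    available = parse_date_available(record["date_available"])
    scraped = parse_scraped_at(record["scraped_at"]).date()
    return (available - scraped).days


sample = {"date_available": "26 Jun 2025", "scraped_at": "2025-07-06 02:01:05"}
```

For the sample record, `days_until_available(sample)` is negative because the property was already available when it was scraped.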
Directory Structure Tree
Gumtree Property Scraper | Auto filters Duplicates/
├── src/
│   ├── runner.py
│   ├── extractors/
│   │   ├── listing_parser.py
│   │   ├── price_utils.py
│   │   └── deduplication.py
│   ├── config/
│   │   └── settings.example.json
│   └── outputs/
│       └── exporters.py
├── data/
│   ├── inputs.sample.txt
│   └── sample_output.json
├── requirements.txt
└── README.md
Use Cases
- Property investors use it to track rental prices, so they can identify undervalued deals early.
- Rent-to-rent sourcers use it to monitor new listings, so they can contact landlords faster than competitors.
- Market analysts use it to study pricing trends, so they can support data-backed investment decisions.
- Agencies use it to build internal property databases, so they can streamline sourcing workflows.
FAQs
Does this tool handle duplicate listings automatically?
Yes. Listings are tracked using unique identifiers and URLs, ensuring previously collected properties are excluded by default.
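One way deduplication by identifier and URL across runs could work is sketched below. It is not the project's actual `deduplication.py`: the function names, the `id:`/`url:` key prefixes, and the JSON file used to persist seen keys are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


def listing_keys(listing: dict) -> Set[str]:
    """Build the keys a listing is tracked by: its ID and its URL."""
    keys: Set[str] = set()
    if listing.get("listing_id"):
        keys.add("id:" + str(listing["listing_id"]))
    if listing.get("url"):
        keys.add("url:" + listing["url"])
    return keys


def filter_new(listings: Iterable[dict], seen: Set[str]) -> List[dict]:
    """Return listings whose ID and URL are both unseen; update `seen` in place."""
    fresh: List[dict] = []
    for listing in listings:
        keys = listing_keys(listing)
        # A match on either the ID or the URL counts as a duplicate.
        # A listing with neither field has no keys and is always kept.
        if keys & seen:
            continue
        seen.update(keys)
        fresh.append(listing)
    return fresh


def load_seen(path: Path) -> Set[str]:
    """Load keys saved by a previous run, or an empty set on the first run."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen(path: Path, seen: Set[str]) -> None:
    """Persist the seen keys so the next run can skip them."""
    path.write_text(json.dumps(sorted(seen)), encoding="utf-8")
```

A run would call `load_seen` at startup, pass each scraped batch through `filter_new`, and call `save_seen` before exiting, so that the next run starts from the same set.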
Can I run it regularly to monitor new listings?
Yes. It is suitable for scheduled executions, and only new or updated listings appear in the results.
Are weekly prices converted automatically?
Yes. Weekly prices are normalized to monthly values so that all pricing is consistent and comparable.
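The README does not state the conversion factor, so the sketch below assumes the common convention of 52 weeks spread over 12 months (monthly = weekly × 52 / 12). The function name `to_monthly` is hypothetical, and the project's `price_utils.py` may use a different rule.

```python
# Assumption: a year has 52 rent weeks spread over 12 months.
WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12


def to_monthly(price_value: float, price_frequency: str) -> float:
    """Convert a 'pw' or 'pm' price to a monthly figure, rounded to pence."""
    frequency = price_frequency.strip().lower()
    if frequency == "pm":
        return float(price_value)
    if frequency == "pw":
        return round(price_value * WEEKS_PER_YEAR / MONTHS_PER_YEAR, 2)
    raise ValueError(f"Unsupported price frequency: {price_frequency!r}")
```

Under this assumption, a listing at £300 pw is compared as £1300 pm, while the £1550 pm listing from the example output is left unchanged.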
What formats can I use for the output data?
The extracted data can be exported in structured formats suitable for spreadsheets, databases, or analytics tools.
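As one example of a spreadsheet-friendly export, the records could be flattened to CSV with the standard library. This is a sketch, not the project's `exporters.py`: the function name `to_csv` and the choice to join list values such as `images` with `" | "` are assumptions.

```python
import csv
import io
from typing import List


def to_csv(listings: List[dict]) -> str:
    """Serialize listing dicts to CSV text, one row per listing."""
    # Collect column names in first-seen order so sparse records still fit.
    fieldnames: List[str] = []
    for listing in listings:
        for key in listing:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for listing in listings:
        # Assumption: list values (e.g. image URLs) are joined into one cell.
        row = {
            key: " | ".join(value) if isinstance(value, list) else value
            for key, value in listing.items()
        }
        writer.writerow(row)  # missing columns are written as empty cells
    return buffer.getvalue()
```

The returned text can be written to a `.csv` file and opened directly in a spreadsheet application.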
Performance Benchmarks and Results
Primary Metric: Processes hundreds of listings per minute, depending on filter complexity.
Reliability Metric: Maintains a high success rate with stable extraction across repeated runs.
Efficiency Metric: Optimized duplicate handling reduces redundant processing and storage.
Quality Metric: Consistently delivers complete property records with normalized pricing and timestamps.
