ImranDevPython/zillow-property-scraper
A powerful Zillow property scraper
Zillow Property Scraper
A powerful web scraper for Zillow real estate listings that effectively handles anti-bot measures and dynamic content loading. Built with Python, this tool successfully navigates Zillow's complex website structure while maintaining a low detection profile.
Screenshots
Scraped properties
Scraped properties in CSV & JSON format
Features
- Advanced anti-bot detection avoidance
- Handles dynamic content and lazy loading
- Comprehensive property data extraction
- Multiple export formats (CSV & JSON)
- Asynchronous operation for better performance
- Responsive to different page states and loading conditions
- Smart pagination handling
- Real-time progress tracking
Installation
Clone the repository
git clone https://github.com/ImranDevPython/zillow-property-scraper.git
Navigate to project directory
cd zillow-property-scraper
Install required packages
pip install -r requirements.txt
Usage
Run the scraper using:
python main.py
The CLI will guide you through:
- Entering location (city and state)
- Selecting number of pages to scrape
- Choosing export format (CSV/JSON)
How It Works
The scraper implements several sophisticated techniques to handle Zillow's anti-bot measures:
- Dynamic wait times between requests
- Natural scrolling behavior simulation
- Automatic handling of lazy-loaded content
- Smart detection of page load completion
- Automatic retry mechanisms for failed requests
- Browser fingerprint randomization
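Two of the techniques above, dynamic wait times and automatic retries, can be illustrated with a minimal standard-library sketch. This is not the project's actual code: the names `random_delay` and `with_retries`, the default timings, and the exponential backoff policy are all assumptions chosen for illustration.

```python
import random
import time


def random_delay(low=1.0, high=3.0, sleep=time.sleep):
    """Sleep for a random duration in [low, high] seconds and return it.

    Hypothetical helper: the real scraper's timing ranges are not documented.
    """
    delay = random.uniform(low, high)
    sleep(delay)
    return delay


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or the attempts are exhausted.

    After each failure the base wait doubles, and a random jitter of up to
    the same amount is added so retries do not follow a fixed rhythm.
    """
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            backoff = base_delay * (2 ** attempt)
            sleep(backoff + random.uniform(0, backoff))
```

A page-loading function could then be wrapped as `with_retries(lambda: load_page(url))`, where `load_page` stands in for whatever call fetches a results page. The `sleep` parameter is injectable so the helpers can be exercised without real waiting.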
Data Output
The scraper collects comprehensive property information including:
- Full address
- Price
- Number of bedrooms
- Number of bathrooms
- Square footage
Data can be exported in both CSV and JSON formats, with automatic file naming based on location and timestamp.
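As a rough illustration of export with location- and timestamp-based file naming, the sketch below uses only the standard library. The field names in `FIELDS`, the filename pattern, and the functions `build_filename` and `export` are assumptions for this example; the scraper's real output schema and naming scheme may differ.

```python
import csv
import json
import re
from datetime import datetime
from pathlib import Path

# Assumed column names, mirroring the fields listed above.
FIELDS = ["address", "price", "bedrooms", "bathrooms", "square_feet"]


def build_filename(location, extension, now=None):
    """Build a name like 'austin_tx_20240102_030405.csv' (assumed pattern)."""
    now = now or datetime.now()
    # Lowercase the location and collapse anything non-alphanumeric into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", location.lower()).strip("_")
    return f"{slug}_{now:%Y%m%d_%H%M%S}.{extension}"


def export(properties, location, fmt="csv", directory=".", now=None):
    """Write a list of property dicts to CSV or JSON and return the path."""
    path = Path(directory) / build_filename(location, fmt, now)
    if fmt == "json":
        path.write_text(json.dumps(properties, indent=2), encoding="utf-8")
    elif fmt == "csv":
        with path.open("w", newline="", encoding="utf-8") as handle:
            # Keys outside FIELDS are ignored; missing keys become empty cells.
            writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(properties)
    else:
        raise ValueError(f"Unsupported format: {fmt!r}")
    return path
```

For example, `export(rows, "Austin, TX", "json")` would write a JSON file in the current directory whose name begins with `austin_tx_`.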
Anti-Detection Measures
- Random delays between actions
- Smooth scrolling simulation
- Dynamic user-agent management
- Intelligent request throttling
- Automatic session management
- Smart pagination handling
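Request throttling and user-agent rotation from the list above could look roughly like the following. This is a sketch under stated assumptions, not the scraper's implementation: the `Throttle` class, `pick_user_agent`, the interval values, and the user-agent strings are all illustrative placeholders.

```python
import random
import time

# Example user-agent strings for illustration only; a real pool would be
# kept up to date with current browser versions.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


class Throttle:
    """Enforce a minimum, slightly randomized gap between actions."""

    def __init__(self, min_interval=2.0, jitter=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._jitter = jitter
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous action, if any

    def wait(self):
        """Sleep if the previous action was too recent; return seconds slept."""
        waited = 0.0
        if self._last is not None:
            target = self._min_interval + random.uniform(0, self._jitter)
            remaining = target - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last = self._clock()
        return waited


def pick_user_agent(pool=USER_AGENTS):
    """Return a randomly chosen user-agent string from the pool."""
    return random.choice(pool)
```

Calling `throttle.wait()` before each page request keeps requests at least `min_interval` seconds apart, with the jitter making the spacing less regular.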
Requirements
- Python 3.7+
- DrissionPage
- Rich (for CLI interface)
- asyncio (included in the Python standard library)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Disclaimer
This tool is for educational purposes only. Please review Zillow's terms of service before use. Be responsible and respectful with scraping frequency.