ericblam/competition-scraper
Program to scrape and collect data from ballroom competitions.
Ballroom Competition Scraper
This system gathers data about ballroom competitions from websites such as O2CM so that results and placements can be analyzed.
Make sure that you have the following system packages installed:
- libpq-dev
- postgresql
- python3-tidylib
Make sure that you have the following Python packages installed:
- urllib3
- pytidylib
- bs4
- psycopg2
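As a quick sanity check, the Python packages listed above can be probed before running anything. The snippet below is a minimal sketch and not part of the repository; `missing_packages` is a hypothetical helper name. It relies on the fact that pytidylib is imported as `tidylib`, while the other three packages are imported under their own names.

```python
import importlib.util

# Maps each pip package name from the list above to the module it is imported as.
REQUIRED_MODULES = {
    "urllib3": "urllib3",
    "pytidylib": "tidylib",
    "bs4": "bs4",
    "psycopg2": "psycopg2",
}


def missing_packages(required=None):
    """Return the pip names of packages whose modules cannot be found."""
    if required is None:
        required = REQUIRED_MODULES
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


# Example use: print(missing_packages()) lists anything that still needs installing.
```

An empty list means every required module can be located. It does not confirm that the system libraries behind them, such as python3-tidylib, are working.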
Additionally, you will need a running PostgreSQL database that matches the configuration in src/config.json, and that database must be loaded with the SQL files in src/db.
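To illustrate how a JSON config file could drive the database connection, here is a minimal sketch. The key names (`host`, `port`, `dbname`, `user`, `password`) and the helpers `load_config`, `connection_kwargs`, and `connect` are assumptions made for this example. The actual layout of src/config.json is not shown here and may differ.

```python
import json


def load_config(path):
    """Read a JSON config file and return it as a dict."""
    with open(path) as config_file:
        return json.load(config_file)


def connection_kwargs(config):
    """Build keyword arguments for psycopg2.connect from a config dict.

    The key names used here are hypothetical; adjust them to match
    whatever src/config.json actually contains.
    """
    return {
        "host": config.get("host", "localhost"),
        "port": config.get("port", 5432),
        "dbname": config["dbname"],
        "user": config["user"],
        "password": config.get("password", ""),
    }


def connect(config):
    """Open a PostgreSQL connection using the settings in config."""
    import psycopg2  # imported here so the helpers above work without it

    return psycopg2.connect(**connection_kwargs(config))
```

Calling `connect(load_config("src/config.json"))` would then fail fast with a clear error if the database is not running or the credentials are wrong, which is a reasonable check to run before starting a long crawl.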
The repository is broken up into several packages:
- .
  - crawler.py
    - Script that creates threads that run the webparsing framework
- db
  - Accessors and objects for competition database objects
- test
  - unittest package for testing code in the repository
- util
- webparser
  - Single-page parsers and other webparsing framework code
To run the crawler, use:
$ python3 crawler.py --numWorkers <number-of-threads> --configFile <config-file-path> http://results.o2cm.com
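The command above takes a worker count, a config file path, and a starting URL. The sketch below shows one way a script could parse those same arguments and fan work out to threads. It is not the actual contents of crawler.py: `parse_args`, `run_workers`, and the `handle_page` callback are hypothetical names, and the shared-queue design is an assumption.

```python
import argparse
import queue
import threading


def parse_args(argv=None):
    """Parse the flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Crawl competition results.")
    parser.add_argument("--numWorkers", type=int, default=1)
    parser.add_argument("--configFile", required=True)
    parser.add_argument("url")
    return parser.parse_args(argv)


def run_workers(start_url, num_workers, handle_page):
    """Crawl from start_url with num_workers threads.

    handle_page(url) is a caller-supplied function (hypothetical) that
    processes one page and returns the URLs it links to.
    Returns the set of every URL that was queued.
    """
    tasks = queue.Queue()
    tasks.put(start_url)
    seen = {start_url}
    seen_lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:  # sentinel: no more work
                tasks.task_done()
                return
            try:
                for link in handle_page(url):
                    with seen_lock:
                        if link in seen:
                            continue
                        seen.add(link)
                    tasks.put(link)
            except Exception as exc:
                # Keep the worker alive so one bad page does not stall the crawl.
                print(f"Failed to process {url}: {exc}")
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    tasks.join()  # returns once every queued page has been processed
    for _ in threads:
        tasks.put(None)
    for thread in threads:
        thread.join()
    return seen
```

New links are queued before the current page is marked done, so `tasks.join()` cannot return while work is still pending. The `seen` set keeps pages that are linked from several places from being fetched more than once.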
MIT License
Created June 30, 2017
Updated August 4, 2025