aaronkaplan/torexitnodes_simple
Simple version of the tor exit node list DB. Part of the Internet Inventory project.
Tracking Tor Exit Nodes
This set of scripts fetches data on Tor exit nodes from what is, as of now, the only known valid and authoritative source: https://check.torproject.org/exit-addresses
After the data is downloaded, it is inserted into a DB (directory: 'db/')
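To make the fetch-and-insert step more concrete, here is a minimal Python sketch of how the downloaded list could be turned into rows. It is not the code used by fetch-tor-list.sh. It assumes the feed consists of "ExitNode", "Published", "LastStatus" and "ExitAddress" lines and that its timestamps are in UTC. The function name parse_exit_addresses is hypothetical, and the "Published" and "LastStatus" values in the sample are made up for illustration.

```python
from datetime import datetime, timezone

# Sample in the assumed feed layout. The Published/LastStatus values are invented.
SAMPLE = """\
ExitNode 0011BD2485AD45D984EC4159C88FC066E5E3300E
Published 2019-08-08 01:12:18
LastStatus 2019-08-08 02:03:12
ExitAddress 162.247.74.201 2019-08-08 07:12:18
"""


def parse_exit_addresses(text):
    """Yield (node_id, ip, exit_address_ts) tuples, one per ExitAddress line."""
    node_id = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ExitNode" and len(fields) == 2:
            # Remember the fingerprint; it applies to the ExitAddress lines that follow.
            node_id = fields[1]
        elif fields[0] == "ExitAddress" and len(fields) == 4 and node_id:
            ts = datetime.strptime(fields[2] + " " + fields[3], "%Y-%m-%d %H:%M:%S")
            # Assumption: the feed reports timestamps in UTC.
            yield node_id, fields[1], ts.replace(tzinfo=timezone.utc)


rows = list(parse_exit_addresses(SAMPLE))
```

Each resulting tuple carries the node fingerprint, the exit IP address and the time at which that address was seen.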
There is a web interface written in python+flask in 'www/'
Directory Contents
www/ -> website stuff
db/ -> DB structure and initial data import mechanisms
data/ -> data directory
Prerequisites
- wget
- postgresql 9.3 or higher
- python
- flask
- webserver such as nginx
See the requirements.txt file in www/tor
Initial setup
$ cd db
$ sudo su
# su - postgres
$ createuser -s username
$ psql template1 < db.sql
Testing if it works
- If you installed the prerequisites into a virtualenv or conda environment, activate it.
- Test if fetching the data works:
First, make sure that the newly created user tordb may access the DB and its tables, and add the user to the PostgreSQL pg_hba.conf file.
$ ./fetch-tor-list.sh
$ psql -U tordb tordb_simple
select count(*) from node;
You should see a non-zero result.
If it works, you can continue to run this automatically...
The execution of fetch-tor-list.sh is expected to output a lot of error messages like this:
ERROR: duplicate key value violates unique constraint "idx_node_combined"
DETAIL: Key (node_id, ip, exit_address_ts, id_nodetype)=(0011BD2485AD45D984EC4159C88FC066E5E3300E, 162.247.74.201, 2019-08-08 09:12:18+02, 1) already exists.
They can and should be ignored: they only mean that the entry is already in the DB.
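The following Python sketch reproduces the effect in isolation. It uses an in-memory SQLite database as a stand-in for PostgreSQL, so it is not the project's actual schema: the column names and the index name are taken from the error message above, while the column types and the insert helper are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL table (column types are assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (node_id TEXT, ip TEXT, exit_address_ts TEXT, id_nodetype INTEGER)"
)
conn.execute(
    "CREATE UNIQUE INDEX idx_node_combined "
    "ON node (node_id, ip, exit_address_ts, id_nodetype)"
)

row = (
    "0011BD2485AD45D984EC4159C88FC066E5E3300E",
    "162.247.74.201",
    "2019-08-08 09:12:18+02",
    1,
)


def insert(row):
    """Insert one row; return False if the unique index rejects it as a duplicate."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO node VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


first = insert(row)   # new entry, gets stored
second = insert(row)  # same entry fetched again, rejected by the unique index
count = conn.execute("SELECT count(*) FROM node").fetchone()[0]
```

The second insert fails in the same way the repeated fetch does, and the table still holds exactly one copy of the entry.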
How to get this to run automatically?
$ crontab -l
(...)
# fetch the list once a day at 1:05 A.M.
# m h dom mon dow command
5 01 * * * ( cd /home/your_user/torexitnodes_simple; source venv/bin/activate ; ./fetch-tor-list.sh >/dev/null 2>&1 )
(Note that this assumes you installed the prerequisites via virtualenv.)
Deploying the web interface properly
The built-in web server of Flask is not suitable for production environments.
Hence, please follow the Flask documentation on production setups.
The instructions vary depending on which web server you use.