m30m/embedding-biases

A simple project to test whether your word embedding is biased toward specific groups

Embedding Biases

This project is just a visualization for this paper.

You can test arbitrary word groups to see whether the embedding relates them to each other in a biased way.
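As an illustration of what testing word groups can mean, here is a minimal sketch that compares how close each target word is to two attribute groups using cosine similarity. It is not necessarily the measure used by this project or the paper; the function names (`cosine`, `mean_similarity`, `association_gap`) and the toy 2-D vectors are assumptions made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(word, group, vectors):
    """Average cosine similarity between `word` and every word in `group`."""
    return sum(cosine(vectors[word], vectors[g]) for g in group) / len(group)


def association_gap(targets, group_a, group_b, vectors):
    """Similarity to group_a minus similarity to group_b, per target word.

    A positive value means the target sits closer to group_a.
    """
    return {
        t: mean_similarity(t, group_a, vectors) - mean_similarity(t, group_b, vectors)
        for t in targets
    }


# Toy 2-D vectors, invented purely for illustration.
vectors = {
    "he": [1.0, 0.0],
    "she": [0.0, 1.0],
    "engineer": [0.9, 0.1],
    "nurse": [0.1, 0.9],
}

gaps = association_gap(["engineer", "nurse"], ["he"], ["she"], vectors)
```

With these toy vectors, `gaps["engineer"]` is positive and `gaps["nurse"]` is negative, which is the kind of asymmetry a bias test looks for in a real embedding.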

Installation

Install the dependencies:

pip3 install -r requirements.txt

You can run the server in three modes:

In-memory Usage

You can use the in-memory option for storing the embedding vectors. This is a good choice if you just want to try out the app:

python3 embedding_biases.py --vectors small.txt
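The format of the vectors file is not described here. Assuming it is the common plain-text layout (one word per line followed by space-separated floats, optionally preceded by a `count dim` header line), loading it into memory could look roughly like the sketch below. `load_vectors` is a hypothetical helper, not a function from this project.

```python
import io


def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a dict of word -> list of floats.

    Assumes vectors have at least two dimensions, so any line with fewer
    than three fields (blank lines, a 'count dim' header) is skipped.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 3:
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# An in-memory stand-in for a small vectors file.
sample = io.StringIO("2 3\nking 0.1 0.2 0.3\nqueen 0.2 0.1 0.4\n")
vectors = load_vectors(sample)
```

For a real file, the same function can be given an open file handle, for example `load_vectors(open("small.txt", encoding="utf-8"))`.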

SQLite Usage

This option uses an already populated SQLite database to look up word embeddings.

First, create and fill the SQLite database:

python3 embedding_biases.py --storage sqlite --sqlite-path sqlite.db --fill-db

And then you can start the server:

python3 embedding_biases.py --storage sqlite --sqlite-path sqlite.db
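To give a feel for what a database-backed vector store involves, here is a minimal sketch using Python's standard `sqlite3` module. The table name `embeddings`, the JSON encoding of vectors, and the helpers `fill_db` and `get_vector` are assumptions for illustration; the schema this project actually uses may differ.

```python
import json
import sqlite3


def fill_db(conn, vectors):
    """Create the table if needed and store each vector as JSON text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "word TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO embeddings (word, vector) VALUES (?, ?)",
        ((word, json.dumps(vec)) for word, vec in vectors.items()),
    )
    conn.commit()


def get_vector(conn, word):
    """Return the stored vector for `word`, or None if it is absent."""
    row = conn.execute(
        "SELECT vector FROM embeddings WHERE word = ?", (word,)
    ).fetchone()
    return json.loads(row[0]) if row else None


# ":memory:" keeps the example self-contained; a path such as
# "sqlite.db" would persist the data between runs.
conn = sqlite3.connect(":memory:")
fill_db(conn, {"king": [0.1, 0.2, 0.3]})
```

Filling the database once and querying it afterwards mirrors the two-step workflow above: only the words being looked up are read from disk, so the full vector set never has to be loaded into memory.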

Redis Usage

You can use Redis instead of SQLite for faster access.
You can spawn a Redis container with the following commands:

apt install docker.io
docker run -d --name embedding-redis -p 127.0.0.1:6379:6379 redis

Use the following command to fill Redis:

python3 embedding_biases.py --storage redis --redis-url redis://127.0.0.1:6379/ --fill-db

And then you can start the server:

python3 embedding_biases.py --storage redis --redis-url redis://127.0.0.1:6379/
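A key-value store like Redis holds bytes, so vectors have to be serialized before they are written. The sketch below shows one possible approach: packing each vector as 32-bit floats under a prefixed key. The key prefix `vec:`, the helper names, and the `FakeClient` stand-in are all assumptions for illustration, and the encoding this project actually uses may differ.

```python
import struct


def encode_vector(vector):
    """Pack a list of floats into little-endian 32-bit floats."""
    return struct.pack(f"<{len(vector)}f", *vector)


def decode_vector(blob):
    """Inverse of encode_vector: 4 bytes per float."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def fill_store(client, vectors, prefix="vec:"):
    """Write every vector under '<prefix><word>'."""
    for word, vector in vectors.items():
        client.set(prefix + word, encode_vector(vector))


def get_vector(client, word, prefix="vec:"):
    """Return the vector for `word`, or None if the key is absent."""
    blob = client.get(prefix + word)
    return decode_vector(blob) if blob is not None else None


class FakeClient:
    """In-process stand-in exposing the same set/get shape as a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, name, value):
        self._data[name] = value

    def get(self, name):
        return self._data.get(name)


client = FakeClient()
fill_store(client, {"king": [0.5, 0.25, -1.0]})
```

With the `redis` package installed and the container above running, `redis.Redis.from_url("redis://127.0.0.1:6379/")` could be passed in place of `FakeClient()`, since its `set` and `get` methods accept and return bytes by default. Note that 32-bit packing halves the storage size at the cost of some precision.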

Languages

HTML 52.5%, Python 47.5%

MIT License
Created May 19, 2018
Updated April 24, 2019