Test machine specs
- Model: Lenovo ThinkPad T470p
- CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz (4 cores + hyperthreading)
- RAM: 16 GB
- Storage: SSD
How to
- Download the datasets using the ``retrieve_datasets.sh`` script. Be patient, as this may take a long time.
- Download and run the GROBID server:

  .. code-block:: console

     user@pc ~ $ git clone git@github.com:kermitt2/grobid
     user@pc ~ $ cd grobid
     user@pc ~/grobid $ ./gradlew run

- In a separate window, generate the test clients and configs with ``create_test_cllients.sh``.
- Install the GROBID client with ``install_client.sh``.
- Run a test client, replacing ``XXX`` with the dataset name:

  .. code-block:: console

     user@pc $ source ./.venv/bin/activate
     (.venv) user@pc $ cd test_clients
     (.venv) user@pc test_clients $ python client_XXX.py
     Dataset XXX:
     time: 1488 ns
     files: 666 files
     size: 6942 bytes
     Done

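
Because the test clients need a running server, it can help to confirm that GROBID is reachable before starting them. The following is a minimal sketch and not part of this repository. It assumes the server listens on GROBID's default address ``http://localhost:8070`` and exposes the ``/api/isalive`` health check; the function names ``grobid_is_alive`` and ``wait_for_grobid`` are hypothetical.

.. code-block:: python

   import time
   import urllib.request

   # Assumption: default local GROBID address and its health-check route.
   GROBID_ALIVE_URL = "http://localhost:8070/api/isalive"


   def grobid_is_alive(url=GROBID_ALIVE_URL, timeout=2.0):
       """Return True if the server answers the health check with HTTP 200."""
       try:
           with urllib.request.urlopen(url, timeout=timeout) as response:
               return response.status == 200
       except OSError:
           # Covers refused connections, timeouts and HTTP error statuses
           # (urllib.error.URLError is a subclass of OSError).
           return False


   def wait_for_grobid(url=GROBID_ALIVE_URL, attempts=30, delay=2.0):
       """Poll the health check until it succeeds or the attempts run out."""
       for _ in range(attempts):
           if grobid_is_alive(url):
               return True
           time.sleep(delay)
       return False

Calling ``wait_for_grobid()`` once before launching a client returns ``True`` as soon as the server responds and ``False`` if it never does within the allotted attempts. Adjust the URL if the server runs on a different host or port.
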
Exploratory Data Analysis
See `EDA.rst <EDA.rst>`_.
Results
See `results.csv <results.csv>`_.
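
The summary that a test client prints (dataset name, time, file count, size) could be collected into CSV rows along the following lines. This is only a sketch: it assumes the output format shown in the How to section, and the column names ``dataset``, ``time_ns``, ``files`` and ``size_bytes`` are hypothetical and may not match the actual layout of ``results.csv``.

.. code-block:: python

   import csv
   import io
   import re

   # Matches the summary block printed by a test client.
   SUMMARY_PATTERN = re.compile(
       r"Dataset (?P<dataset>[^:\s]+):\s*"
       r"time: (?P<time_ns>\d+) ns\s*"
       r"files: (?P<files>\d+) files\s*"
       r"size: (?P<size_bytes>\d+) bytes"
   )

   # Hypothetical column names; the real results.csv may differ.
   FIELDNAMES = ["dataset", "time_ns", "files", "size_bytes"]


   def parse_summary(text):
       """Extract one summary record from the client's console output."""
       match = SUMMARY_PATTERN.search(text)
       if match is None:
           raise ValueError("no dataset summary found in client output")
       fields = match.groupdict()
       return {
           "dataset": fields["dataset"],
           "time_ns": int(fields["time_ns"]),
           "files": int(fields["files"]),
           "size_bytes": int(fields["size_bytes"]),
       }


   def to_csv(rows):
       """Render a list of summary records as CSV text with a header row."""
       buffer = io.StringIO()
       writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
       writer.writeheader()
       writer.writerows(rows)
       return buffer.getvalue()

Passing the captured output of each client run to ``parse_summary`` and the resulting list to ``to_csv`` yields one line per dataset, which makes runs easier to compare side by side.
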