caballeto/benchmark-flair-inference

Tested Flair's ner-fast model (see the benchmark sketch after this list):

  • Input size: 670k characters
  • Average article size: 2500 characters
  • Language: English
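
The setup can be reproduced with a loop along these lines (a minimal sketch: the load_articles helper is hypothetical, and the exact timing harness is an assumption, not the repository's code):

    import time

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Downloads the pre-trained English ner-fast model on first use.
    tagger = SequenceTagger.load("ner-fast")

    # Hypothetical helper: ~268 articles of ~2500 characters each.
    articles = load_articles()

    start = time.perf_counter()
    for text in articles:
        sentence = Sentence(text)
        tagger.predict(sentence)  # annotates the sentence in place
    elapsed = time.perf_counter() - start

    total_chars = sum(len(a) for a in articles)
    print(f"Total time: {elapsed:.0f}s")
    print(f"Time per 1k chars: {elapsed / (total_chars / 1000):.2f}s")
    print(f"Articles per second: {len(articles) / elapsed:.1f}")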

CPU performance

  • CPU: 8-core 1.6 GHz Intel Core i5
  • Total time: 6 minutes
  • Time per 1k chars: 0.53 seconds
  • Articles per second: 0.7

GPU performance

  • GPU: Tesla V100
  • Card memory: 16 GB
  • Accelerator: CUDA (see the device-selection sketch after this list)
  • Total time: 20 seconds
  • Time per 1k chars: 0.03 seconds
  • Articles per second: 13.3
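
Both runs use the same code; only the device differs. A minimal sketch of the switch, assuming flair.device (the library's global device setting) is set before the model is loaded:

    import flair
    import torch

    # CPU run:
    flair.device = torch.device("cpu")

    # GPU run (CUDA), e.g. on a Tesla V100:
    if torch.cuda.is_available():
        flair.device = torch.device("cuda:0")

    # Models loaded after this point are placed on flair.device.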

Conclusion

Running on a GPU offers a ~20x speedup over the CPU (0.7 vs. 13.3 articles per second). Inference could likely also run in parallel threads, since the card has
enough memory to hold multiple copies of the model. Assuming at least 4 models run in parallel, throughput would reach ~50 articles per second, which is more
than enough for our use case.
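
A hypothetical sketch of that parallel setup follows; the worker count, chunking strategy, and use of threads are assumptions (PyTorch releases the GIL while
CUDA kernels run, but model copies on one card still contend for it, so real throughput would have to be measured):

    from concurrent.futures import ThreadPoolExecutor

    from flair.data import Sentence
    from flair.models import SequenceTagger

    NUM_WORKERS = 4  # assumed number of model copies the 16 GB card can hold

    # One independent model instance per worker, all on the GPU.
    taggers = [SequenceTagger.load("ner-fast") for _ in range(NUM_WORKERS)]

    def tag_chunk(tagger, texts):
        sentences = [Sentence(t) for t in texts]
        tagger.predict(sentences, mini_batch_size=32)  # batched inference
        return sentences

    def tag_all(articles):
        # Round-robin split: one chunk of articles per worker.
        chunks = [articles[i::NUM_WORKERS] for i in range(NUM_WORKERS)]
        with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
            results = pool.map(tag_chunk, taggers, chunks)
        return [s for chunk in results for s in chunk]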
