GitHunt
WE

Wenox/fast-fw

Optimized implementation of Floyd Warshall algorithm using modern AVX2.

Fast Floyd Warshall

Optimized implementation of Floyd Warshall algorithm using modern AVX2 extended instruction set.

SIMD vector instructions allow efficient parallelization on the CPU level.

The results become even better when the graph weights are limited to the range of 8 or 16 bits.

Results

Assembly implementation completely outperforms C++ implementation even when -O3 compiler flag is specified.

This was tested on both sparse and dense graphs of large sizes.

Comparison including C++

GitHub Logo
GitHub Logo

Comparison excluding C++

GitHub Logo
GitHub Logo

November, 2020.

Languages

C++64.8%Assembly30.5%C4.8%

Contributors

MIT License
Created July 25, 2021
Updated April 15, 2024