73 results for “topic:sse2”
Implementations of SIMD instruction sets for systems which don't natively support them.
Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12, MySQL, Chromium, Redis and WebKit/Safari
A simple and fast linear algebra library for games and graphics
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension, LoongArch64, POWER. Part of Node.js, WebKit/Safari, Ladybird, Chromium, Cloudflare Workers and Bun.
WHATWG-compliant and fast URL parser written in modern C++, part of Internet Archive, Node.js, ClickHouse, Redpanda, Kong, Telegram, AdGuard, Datadog and Cloudflare Workers.
Expressive Vector Engine - SIMD in C++ Goes Brrrr
Fastest Integer Compression
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
fccf: A command-line tool that quickly searches through C/C++ source code in a directory based on a search string and prints relevant code snippets that match the query.
Agenium Scale vectorization library for CPUs and GPUs
TurboRLE: Fastest Run-Length Encoding
Boost SIMD
Single-header C99 accelerated float/double parsing. Port of the fast_float library.
A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).
SIMD macro assembler unified for ARM, MIPS, PPC and x86
Optimized Recursive Bilateral Filter
Fastest Histogram Construction
Low level generic SIMD wrapper for x86, ARM, WASM with dynamic dispatch
A header-only, ready-to-include mirror of the HIIR library by Laurent De Soras, an oversampling and Hilbert transform library in C++, with additional support for double precision on ARM AArch64 using NEON.
A simple demo showing how to use SIMD (Single Instruction, Multiple Data) to optimize and accelerate the FFT algorithm.
x64 Assembly Demo Framework
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
A high-performance C library for Longest Prefix Match (LPM) lookups, supporting both a multi-bit trie with 8-bit stride for IPv4 and a wide multi-level trie with 16-bit stride for IPv6, featuring runtime dynamic SIMD dispatching (SSE2, SSE4.2, AVX, AVX2, AVX512F) for optimal performance and throughput on any CPU architecture.
Simple example for embedding SSE2 assembly in Cython projects
A multi-arch library implementing the Argon2 password hashing algorithm.
Software implementation of ARM and x86 SIMD intrinsics
x64/SSE2 and AArch64/NEON SIMD layer in a single C/C++ header file, with functions/classes
Fast BSON to JSON string transcoder
Operator overloading for vector and matrix operations using Intel SIMD SSE/SSE2/SSE3 instructions, written in Free Pascal
ChaCha20 C SIMD implementations - AVX512, AVX2, SSE2