"topic:parallel-algorithms" — Search

46 results for “topic:parallel-algorithms”

Shape Aware Parallel Mesh Simplification Algorithm

3d mesh-optimization mesh-simplification meshes multithreading parallel-algorithm parallel-algorithms parallel-computing quadric-metrics shape thread-pool

ROCm/HIP-CPU

An implementation of HIP that works on CPUs, across OSes.

C++13124Updated 2 months ago

cpp17 cuda cuda-programming hip hip-kernel-language hip-portability hip-runtime parallel-algorithms spmd stl-algorithms

PartitionedArrays/PartitionedArrays.jl

Large-scale, distributed, sparse linear algebra in Julia.

Julia13121Updated 2 months ago

hpc julia linear-algebra mpi parallel-algorithms parallel-data

wangyiqiu/hdbscan

A Fast Parallel Algorithm for HDBSCAN* Clustering

C++6314Updated 5 months ago

clustering hdbscan parallel-algorithms

rbga/CUDA-Merge-and-Bitonic-Sort

Efficient implementations of Merge Sort and Bitonic Sort algorithms using CUDA for GPU parallel processing, resulting in accelerated sorting of large arrays. Includes both CPU and GPU versions, along with a performance comparison.

Cuda210Updated 2 weeks ago

algorithm-implementation array-sorting bitonic-sort cplusplus cuda cuda-cpp efficient-sorting gpgpu gpu-acceleration gpu-computing gpu-parallelism gpu-programming high-performance-computing merge-sort nvidia-gpu parallel-algorithms parallel-processing parallel-sorting sorting-algorithms sorting-performance

JuliaAstroSim/ParallelOperations.jl

Basic parallel algorithms for Julia

Julia210Updated 1 month ago

julia parallel parallel-algorithms workers

RapidsAtHKUST/EGSM

Source code and datasets for "Efficient GPU-Accelerated Subgraph Matching", accepted at SIGMOD'23, by Xibo Sun and Prof. Qiong Luo

Cuda207Updated 1 year ago

gpu graph graph-algorithms nvidia parallel-algorithms subgraph-matching

jamshed/CaPS-SA

Cache-friendly, Parallel, and Samplesort-based Constructor for Suffix Arrays and LCP Arrays

C++177Updated 4 months ago

algorithms data-structures lcp-array parallel-algorithms suffix-array text-indexing

luigicapogrosso/HermesBDD

Official implementation of the paper "HermesBDD: A Multi-Core and Multi-Platform Binary Decision Diagram Package", accepted at DDECS 2023.

C++133Updated 3 months ago

binary-decision-diagrams parallel-algorithms

dominikkempa/psascan

Parallel external memory suffix array construction

C++81Updated 1 year ago

external-memory parallel-algorithms string-indexing suffix-array suffix-array-construction

Shikha-code36/cuda-hft-fundamentals

CUDA implementation of HFT components showcasing GPU acceleration for financial applications. Features limit order book with matching engine and parallel sorting for market data. Demonstrates significant performance gains over CPU implementations.

Cuda60Updated 2 months ago

cpp cuda cuda-programming finance gpu-computing hft market-data nvidia order-book parallel-algorithms parallel-sort parallel-sorting thrust

Bader-Research/ListRanking

Parallel List Ranking for multicore processors

C40Updated 2 years ago

high-performance-computing list-ranking multicore multithreaded parallel parallel-algorithms parallel-computing parallel-primitives supercomputing

dominikkempa/pem-bwt

Parallel external memory construction of BWT from SA

C++40Updated 3 years ago

burrows-wheeler-transform bwt bwt-construction external-memory parallel-algorithms suffix-array

Bader-Research/MST

MST: Parallel Minimum Spanning Forest

C40Updated 8 months ago

graph graph-algorithms minimum-spanning-forest minimum-spanning-tree multicore multithreaded parallel parallel-algorithms parallel-primitives supercomputing

MattiaOldani/Algoritmi-Paralleli-Distribuiti

Parallel and distributed algorithms

Typst40Updated 1 month ago

algorithms distributed-algorithms parallel-algorithms

salvatorecorvaglia/parallel-algorithms-project

Parallel Cholesky Factorization of a SPD Matrix with MPI

C30Updated 8 months ago

cholesky mpi parallel-algorithms

joulook/Parallel-Processing-Spring-2021

This repository contains all of my projects for the Parallel Processing course, taken in the 2nd semester of my master's at SUT.

Java30Updated 2 years ago

cuda gpu-programming jacobi-iteration map-reduce multicore-programming multithreading openmp parallel-algorithms parallel-computing parallel-convex-hull parallel-jacobi parallel-matrix-multiplication parallel-processing parallel-quick-sort systolic-arrays

acdmammoths/parallelcubesampling

Implementations of the parallel and sequential cube sampling algorithms presented in the paper "A Scalable Parallel Algorithm for Balanced Sampling" (Alexander Lee, Stefan Walzer-Goldfeld, Shukry Zablah, Matteo Riondato, AAAI'22 Student Abstract).

Python21Updated 2 years ago

balanced-sampling cubesampling parallel-algorithms sampling sampling-methods

caseyjkey/CUDA-OpenMP-MPI-Implementations

Parallel computing with CUDA, OpenMP, MPI

C20Updated 1 month ago

cuda distributed-computing gpu-acceleration high-performance-computing hpc latency-reduction memory-optimization mpi openmp parallel-algorithms

rick1924/ParallelProductStabilizers

Run-time improvements on the computation of the inner product for stabilizer states, using parallel and sparse implementations

C10Updated 3 years ago

bsp matrix parallel-algorithms quantum-computing stabilizer-states

deepcloudlabs/dcl115-2021-apr-26

DCL-115: Multi-Threaded Programming in C++17

C++11Updated 3 years ago

cpp-11 cpp-14 cpp-17 multi-threading mutex parallel-algorithms parallel-stl ranges thread-safety

GRAYgoose124/fragment_shaders

A collection of my fragment shaders.

GLSL10Updated 1 year ago

code-golf fragment-shader glsl graphics-programming parallel-algorithms shaders

mihail-stoica/Distributed-Systems

Distributed Java Applications at Scale, Parallel Programming, Distributed Computing

Java10Updated 1 year ago

apache-kafka apache-zookeeper distributed-computing distributed-databases distributed-systems haproxy java-http-client java-http-server mongodb parallel-algorithms protcol-buffers

Bader-Research/biconnected-components

Parallel Biconnected Components

C10Updated 5 years ago

biconnected-components graph-algorithms multithreaded parallel parallel-algorithms parallel-primitives

Bader-Research/SIMPLE

SIMPLE is a framework for implementation of parallel algorithms using our methodology for developing high performance programs running on clusters of SMP nodes. Our methodology is based on a small kernel (SIMPLE) of collective communication primitives that make efficient use of the hybrid shared and message passing environment. We illustrate the power of our methodology by presenting experimental results for sorting integers, two-dimensional fast Fourier transforms (FFT), and constraint-satisfied searching.

C10Updated 5 years ago

cluster-computing parallel parallel-algorithms parallel-computing parallel-primitives parallel-programming supercomputing

ollie-sara/PProg-FS21

Everything and anything I could implement from the course Parallel Programming in FS21 at ETH Zurich.

Java11Updated 9 months ago

locks parallel-algorithms parallelism

ucrparlay/Orionet

SPAA'25: Parallel Point-to-Point Shortest Paths and Batch Queries

C++10Updated 5 days ago

graph-algorithms parallel-algorithms shortest-path-algorithm

autumnharmony/oographplus

OpenOffice add-on for graphical modelling of parallel algorithms