HA
hamdiboukamcha/SPEED-SAM-C-TENSORRT
SPEED SAM C++ TENSORRT: A high-performance C++ implementation using TensorRT and CUDA
SPEED SAM C++ TENSORRT
๐ Overview
A high-performance C++ implementation for SAM (segment anything model) using TensorRT and CUDA, optimized for real-time image segmentation tasks.
๐ข Updates
Model Conversion: Build TensorRT engines from ONNX models for accelerated inference.
Segmentation with Points and BBoxes: Easily segment images using selected points or bounding boxes.
FP16 Precision: Choose between FP16 and FP32 for speed and precision balance.
Dynamic Shape Support: Efficient handling of variable input sizes using optimization profiles.
CUDA Optimization: Leverage CUDA for preprocessing and efficient memory handling.
๐ข Performance
Infernce Time
| Component | SpeedSAM |
|---|---|
| Image Encoder | |
| Parameters | 5M |
| Speed | 8ms |
| Mask Decoder | |
| Parameters | 3.876M |
| Speed | 4ms |
| Whole Pipeline (Enc+Dec) | |
| Parameters | 9.66M |
| Speed | 12ms |
Results
๐ Project Structure
SPEED-SAM-CPP-TENSORRT/
โโโ include
โ โโโ config.h # Model configuration and macros
โ โโโ cuda_utils.h # CUDA utility macros
โ โโโ engineTRT.h # TensorRT engine management
โ โโโ logging.h # Logging utilities
โ โโโ macros.h # API export/import macros
โ โโโ speedSam.h # SpeedSam class definition
โ โโโ utils.h # Utility functions for image handling
โโโ src
โ โโโ engineTRT.cpp # Implementation of the TensorRT engine
โ โโโ main.cpp # Main entry point
โ โโโ speedSam.cpp # Implementation of the SpeedSam class
โโโ CMakeLists.txt # CMake configuration
๐ Installation
Prerequisites
git clone https://github.com/hamdiboukamcha/SPEED-SAM-C-TENSORRT.git
cd SPEED-SAM-CPP-TENSORRT
# Create a build directory and compile
mkdir build && cd build
cmake ..
make -j$(nproc)
Note: Update the CMakeLists.txt with the correct paths for TensorRT and OpenCV.
๐ฆ Dependencies
CUDA: NVIDIA's parallel computing platform
TensorRT: High-performance deep learning inference
OpenCV: Image processing library
C++17: Required standard for compilation
๐ Code Overview
Main Components
SpeedSam Class (speedSam.h): Manages image encoding and mask decoding.
EngineTRT Class (engineTRT.h): TensorRT engine creation and inference.
CUDA Utilities (cuda_utils.h): Macros for CUDA error handling.
Config (config.h): Defines model parameters and precision settings.
Key Functions
EngineTRT::build: Builds the TensorRT engine from an ONNX model.
EngineTRT::infer: Runs inference on the provided input data.
SpeedSam::predict: Segments an image using input points or bounding boxes.
๐ Contact
For advanced inquiries, feel free to contact me on LinkedIn: ![]()
๐ Citation
If you use this code in your research, please cite the repository as follows:
@misc{boukamcha2024SpeedSam,
author = {Hamdi Boukamcha},
title = {SPEED-SAM-C-TENSORRT},
year = {2024},
publisher = {GitHub},
howpublished = {\url{https://github.com/hamdiboukamcha/SPEED-SAM-C-TENSORRT//}},
}