Base16384-SYCL

A high-performance Base16384 encoding library implemented in SYCL, using Intel oneAPI DPC++, for accelerated computation on heterogeneous hardware.

Overview

Note: This library requires the Intel oneAPI DPC++/SYCL compiler and runtime. Set up the oneAPI environment before building or running the applications.

Base16384-SYCL implements the Base16384 encoding algorithm in SYCL, built with Intel oneAPI DPC++ (Data Parallel C++), so that encoding and decoding can run in parallel on both CPUs and GPUs. The same source builds on Windows and Unix-like systems.
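As background, Base16384 maps every 14 bits of input to one of 16384 code points starting at U+4E00, so a full group of 7 bytes (56 bits) becomes 4 characters. The sketch below illustrates that arithmetic for one complete group in plain C++. The function names are hypothetical and are not this library's API, and the most-significant-bit-first packing is an assumption made for illustration. Handling of a trailing partial group is omitted; consult the reference implementation for the exact bit layout.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper names for illustration; not this library's API.

// Pack one full 7-byte group (56 bits) into four 14-bit values, each
// offset by 0x4E00. Assumption: bits are consumed most-significant first.
std::array<std::uint16_t, 4> encode_group(const std::array<std::uint8_t, 7>& in) {
    std::uint64_t bits = 0;
    for (std::uint8_t b : in) bits = (bits << 8) | b;  // 56 bits, big-endian
    std::array<std::uint16_t, 4> out{};
    for (int i = 0; i < 4; ++i) {
        const int shift = 14 * (3 - i);
        out[i] = static_cast<std::uint16_t>(((bits >> shift) & 0x3FFF) + 0x4E00);
    }
    return out;
}

// Inverse of encode_group: strip the offset and reassemble the 7 bytes.
std::array<std::uint8_t, 7> decode_group(const std::array<std::uint16_t, 4>& in) {
    std::uint64_t bits = 0;
    for (std::uint16_t u : in) {
        assert(u >= 0x4E00 && u <= 0x8DFF);  // must lie in the 16384-value range
        bits = (bits << 14) | ((u - 0x4E00u) & 0x3FFF);
    }
    std::array<std::uint8_t, 7> out{};
    for (int i = 0; i < 7; ++i)
        out[i] = static_cast<std::uint8_t>(bits >> (8 * (6 - i)));
    return out;
}
```

Because each group is independent of its neighbours, a parallel implementation can assign one group (or a batch of groups) to each work-item.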

Features

Hardware Acceleration: Uses SYCL for parallel processing on CPUs, GPUs, and other accelerators
Cross-Platform Support: Compatible with Windows and Unix-like systems
Performance Optimized: Includes vectorization and memory optimization for maximum throughput
Robust Error Handling: Comprehensive exception handling with detailed error reporting
Modern C++: Written in C++20 with modern programming practices

Prerequisites

Required Dependencies

Intel oneAPI Toolkit: DPC++/SYCL compiler and runtime
CMake: Version 3.4 or higher

Windows-Specific Requirements

Visual Studio Build Tools or Visual Studio IDE
Intel DPC++ compiler (icx-cl)
NMake (included with Visual Studio)

Unix/Linux Requirements

Intel DPC++ compiler (icpx)
Standard build tools (make, etc.)

Installation

1. Environment Setup

Tip (VS Code users): The environment setup commands run automatically when you open a terminal in Visual Studio Code. If they fail, the likely cause is a non-standard installation path; adjust the paths in .vscode/settings.json accordingly.

Windows:

rem Run setvars.bat from your Intel oneAPI installation directory
rem Typically: C:\Program Files (x86)\Intel\oneAPI\ (adjust if installed elsewhere)
"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"

Linux/Unix:

# Load the oneAPI environment into the current shell
# Typical installation directory: /opt/intel/oneapi/ (adjust if installed elsewhere)
source /opt/intel/oneapi/setvars.sh

2. Build Process

Clone the repository and create a build directory:

git clone https://github.com/fumiama/base16384-sycl.git
cd base16384-sycl
mkdir build
cd build

Configure the build system:

Add -DBUILD=test to the cmake command for your platform to enable testing.

Windows:

cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release ..

Unix-like:
```
cmake -DCMAKE_BUILD_TYPE=Release ..
```

Compile the project:

cmake --build .

3. Testing

Run the test suite from the build directory (the build must have been configured with -DBUILD=test):

ctest

4. Performance Analysis with Intel VTune

Intel VTune Profiler is a performance analysis tool that helps you identify bottlenecks and optimize your application.

Prerequisites

Intel VTune Profiler (included in Intel oneAPI Base Toolkit)
A Base16384-SYCL application or test binary built with debug symbols (use the RelWithDebInfo build type)

Running VTune Analysis

1. Launch VTune GUI:

vtune-gui

2. Create a New Project:

Click "New Project" in the welcome screen
Set project name and location
Configure the target application path

3. Configure Analysis Type:

Choose an analysis type based on your profiling goals:

Hotspots Analysis: Identify CPU-intensive functions
GPU Offload Analysis: Analyze GPU kernel performance and host-device data transfer
Memory Consumption: Track memory usage patterns
Threading Analysis: Detect threading issues and analyze parallelism

4. Run the Analysis:

Click the "Start" button to begin profiling
VTune will execute your application and collect performance data

5. Analyze Results:

Key metrics to examine:

Kernel Execution Time: Time spent in SYCL kernels
Memory Transfer Overhead: Host-to-device and device-to-host data transfer time
CPU Utilization: Host CPU usage during GPU operations
GPU Utilization: GPU compute unit occupancy

Optimization Tips

Based on VTune analysis, consider these optimization strategies:

Reduce Host-Device Transfer: Minimize data copying between CPU and GPU
Increase Kernel Occupancy: Optimize work-group sizes and global range
Use Local Memory: Stage frequently accessed data in work-group local memory (the SYCL counterpart of CUDA shared memory)
Batch Operations: Process larger data chunks to amortize kernel launch overhead
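Two of the tips above come down to simple launch arithmetic, sketched here in plain C++ so it compiles without a SYCL toolchain. The helper names are hypothetical and not part of this library. The first rounds the number of work-items up to a multiple of the work-group size, which a SYCL nd_range launch requires; the kernel must then ignore indices at or beyond the real item count. The second counts how many kernel launches a given chunk size implies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; not part of this library's API.

// Round the work-item count up to a multiple of the work-group size.
// The kernel body must skip any index >= n introduced by the padding.
std::size_t padded_global_range(std::size_t n, std::size_t work_group_size) {
    assert(work_group_size > 0);
    return (n + work_group_size - 1) / work_group_size * work_group_size;
}

// Number of kernel launches needed to process total_bytes in fixed chunks.
std::size_t launch_count(std::size_t total_bytes, std::size_t chunk_bytes) {
    assert(chunk_bytes > 0);
    return (total_bytes + chunk_bytes - 1) / chunk_bytes;
}
```

For example, 1000 work-items with a work-group size of 256 pad to a global range of 1024. A 1 MiB input processed in 4 KiB chunks needs 256 launches, while a single 1 MiB chunk needs one, which is why larger chunks amortize launch overhead. Suitable work-group and chunk sizes depend on the device, so treat any specific value as a starting point to validate in VTune.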

Build Configuration

The project supports multiple build configurations:

Release: Optimized for maximum performance (-O3, /O2)
Debug: Includes debugging symbols and reduced optimization
RelWithDebInfo: Release optimization with debug information
MinSizeRel: Optimized for minimal binary size

Compatibility

Operating Systems: Windows 10/11, Linux, macOS
Architectures: x86-64, ARM64 (where Intel oneAPI is supported)
Hardware: Intel CPUs, Intel GPUs (via Level Zero or OpenCL), NVIDIA GPUs (via the CUDA backend plugin), AMD GPUs (experimental, via the HIP backend plugin)

Contributing

Contributions are welcome! Please ensure that:

Code follows the existing style and conventions
All tests pass (ctest)
New features include appropriate test coverage
Documentation is updated for significant changes

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See the LICENSE file for detailed information.

Acknowledgments

Intel oneAPI team for the SYCL implementation
Base16384 algorithm developers
Contributors to the open-source community
