jawrainey/hfta
Reference implementation: run any huggingface tokenizer in Android (rust).
HuggingFace Tokenizers on Android (HFTA)
Reference implementation for running Hugging Face (HF) tokenizers on Android.
Demo Video
The demo UI converts text to tokens in real time on Android using the tokenizers library. The recording is at demo/demo.mp4.
Try a Tokenizer
- Find a model you want to test on HF, e.g., Google's gemma-3-4b-it
- Download its `tokenizer.json` and add it to `app/src/main/assets`, named `gemma-3-4b-it.json`
- Modify `SELECTED_TOKENIZER` in `app/build.gradle.kts`
Features
- Run any Hugging Face (HF) tokenizer on-device on Android
- Rust-to-Java NDK bindings for HF's tokenizers in `rs-hfta`
- JNI bindings between Rust and Android
- Parameterized instrumentation tests (run on-device)
- Compiler optimizations to reduce the library file size
Implementation Details
Run any HF tokenizer on Android using the model's tokenizer.json from huggingface.co. To achieve that, HF's tokenizers library is compiled from Rust into a shared library, which the Android app loads and calls through the Java Native Interface (JNI).
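For JNI to find a native function in the shared library without explicit registration, the exported symbol name has to follow the JNI naming convention: `Java_`, then the mangled fully qualified class name, an underscore, and the mangled method name. The sketch below derives that name using only the Rust standard library. The class `com.example.hfta.Tokenizer` and the method `encode` are hypothetical placeholders, not names taken from this repository, and overloaded methods (which append a signature suffix) are not handled.

```rust
/// Mangle one name component using the JNI escape rules:
/// '.' or '/' -> "_", '_' -> "_1", ';' -> "_2", '[' -> "_3",
/// any other non-alphanumeric character -> "_0xxxx" per UTF-16 unit.
fn mangle(part: &str) -> String {
    let mut out = String::new();
    for c in part.chars() {
        match c {
            '.' | '/' => out.push('_'),
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            c if c.is_ascii_alphanumeric() => out.push(c),
            other => {
                // Escape as lowercase hex, one group per UTF-16 code unit.
                let mut buf = [0u16; 2];
                for unit in other.encode_utf16(&mut buf).iter() {
                    out.push_str(&format!("_0{:04x}", unit));
                }
            }
        }
    }
    out
}

/// Build the symbol name a Rust `extern` function must export so the JVM
/// can resolve a native method declared on `class` (hypothetical names).
fn jni_symbol(class: &str, method: &str) -> String {
    format!("Java_{}_{}", mangle(class), mangle(method))
}

fn main() {
    // Hypothetical class and method, for illustration only.
    let symbol = jni_symbol("com.example.hfta.Tokenizer", "encode");
    println!("{}", symbol);
}
```

Note that an underscore inside a package, class, or method name becomes `_1` in the symbol, which is a common cause of `UnsatisfiedLinkError` when the exported name is written by hand.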
Thanks to
- Hugging Face's `tokenizers` library
- Qualcomm's Genie Library, which has a Rust-to-C++ static library implementation of HF's tokenizers at `qairt/2.34.0.250424/examples/Genie/Genie/src/qualla/tokenizers/rust`
- Shubham Panchal's `Sentence-Embeddings-Android`
- Rust's `profile` docs