68 results for “topic:token-classification”
Lightweight hallucination detection framework for RAG applications
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning
Transformer-based models implemented in TensorFlow 2.x (using Keras).
Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging
Unofficial (Golang) Go bindings for the Hugging Face Inference API
A collection of datasets for the Ukrainian language
Lightweight self-hosted span annotation tool
Bilibili channel AI日日新: occasionally updated examples of machine learning, deep learning, and data science tasks done with Python frameworks
A deep research study introducing the Gene Drift Hypothesis: a framework explaining how tokenomics mutate across market cycles. Analyzes evolutionary forces, selective pressures, behavioral traits, and economic genes that rise, fall, or mutate through bull/bear phases, shaping token species over time.
A research-grade exploration of the Tokenomics Ecological Framework, analyzing how tokens behave as predator, prey, parasite, and symbiotic species. Examines ecosystem interactions, evolutionary pressures, species population cycles, and the dynamics of economic predation, mutation, drift, and long-term survival across market cycles.
A research-grade framework for extracting, classifying, and analyzing the “genetic” behavior of smart contract tokens. Identifies economic traits, supply mutations, fee patterns, permission risks, upgradeability vectors, and scam species using a structured gene taxonomy with risk scoring, HTML reports, and token comparison tools.
Extracting terms from text using XLM-R for token and sequence classification
Token classification using PhoBERT models for Vietnamese
CNER: Concept and Named Entity Recognition
bullet: A Zero-Shot / Few-Shot Learning, LLM Based, text classification framework
The MERIT Dataset is a fully synthetic, labeled dataset created for training and benchmarking LLMs on Visually Rich Document Understanding tasks. It is also designed to help detect biases and improve interpretability in LLMs, an area of active work. This repository is actively maintained, and new features are continuously being added.
Russian text labeled token by token for training NER models, built from samples obtained by parsing various resources and generated by ChatGPT.
Applied Deep Learning 深度學習之應用 by Vivian Chen 陳縕儂 at NTU CSIE
A Java NLP application that identifies names, organizations, and locations in text by utilizing Hugging Face's RoBERTa NER model through the ONNX runtime and the Deep Java Library.
Implementation of the paper, MAPLE - MAsking words to generate blackout Poetry using sequence-to-sequence LEarning, ICNLSP 2021
Data and code for the paper "ID10M: Idiom Identification in 10 Languages" (NAACL 2022).
Text preprocessing and PII anonymisation for NLP/ML. ONNX NER ensemble, language detection, stopword removal. Built for statistical ML and language models.
The Learning Agency Lab - PII Data Detection || Develop automated techniques to detect and remove PII from educational data.
Multi-task NLP Annotation Framework
Links to my repositories, where I implement a wide variety of Natural Language Processing models using TensorFlow and Hugging Face.
Generative adversarial approach to most popular NLP tasks
The default way to fine-tune BERT is wrong. Here is why
Fine-tuning a 🤗 Transformers model for a soft-skill NER task
This app searches Reddit posts and comments to determine whether a product or service has positive or negative sentiment, and predicts top product mentions using Named Entity Recognition