Natural Language Processing
s-nlp
Code for research papers, plus useful NLP software and datasets
Top Repositories
Materials of transformers lecture course
http://nlp.seas.harvard.edu/2018/04/03/attention.html
Code and data of "Methods for Detoxification of Texts for the Russian Language" paper
Models for automatically transforming toxic text to neutral
Data and info for the paper "ParaDetox: Text Detoxification with Parallel Data"
Repositories
Materials of transformers lecture course
Detecting overflow in compressed token representations for RAG
Ridic
[ACL 2025] Adaptive Retrieval without Self-Knowledge? Bringing Uncertainty Back Home
Models for automatically transforming toxic text to neutral
http://nlp.seas.harvard.edu/2018/04/03/attention.html
Code and data of "Methods for Detoxification of Texts for the Russian Language" paper
ConflictBench
This repository provides the code implementation for the paper: "Memory Efficient LM Compression using Fisher Information from Low-Rank Representations"
The PsiloQA pipeline automates the construction of a multilingual, span-level hallucination detection dataset with contexts.
Official implementation of NAACL 2025 SRW paper "SkipCLM: Enhancing Crosslingual Alignment of Decoder Transformer Models via Contrastive Learning and Skip Connection"
Data and info for the paper "ParaDetox: Text Detoxification with Parallel Data"
SHROOM SemEval shared task code and data
Project on transformers compression
RUSSE 2022: Russian Text Detoxification Based on Parallel Corpora
The code related to the paper
Official implementation of NAACL 2025 Main Conference Paper "Modern LLMs are Few-Shot Parallel Detoxification Data Annotators"
Through the Looking Glass
Data from "Crowdsourcing of Parallel Corpora: the Case of Style Transfer for Detoxification" paper
Code for PAN-2024 Multilingual Text Detoxification
Dataset for evaluating the quality of content preservation measures in text formality transfer for task-oriented domains