129 results for “topic:nlp-resources”
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Portuguese pre-trained BERT models
The hands-on NLTK tutorial for NLP in Python
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Chinese NLP corpus of Chinese science fiction: about 4,675 Chinese science fiction novels. Chinese science fiction NLP corpus, Chinese science fiction text corpus, Chinese science fiction text database, science fiction corpus
Projects and useful articles / links
My NLP datasets for Russian language
A curated list of beginner resources in Natural Language Processing
A lexicon for Sudachi
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
A curated list of NLP resources for Hungarian
NLP and Bahasa resources
A Dutch RoBERTa-based language model
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition (ACL 2020)
Summaries of all the papers I read
Chinese NLP corpus of Chinese science fiction: archive of the Ark Plan of the Ula science fiction website. Chinese science fiction NLP corpus, Chinese science fiction text corpus, Chinese science fiction text database, science fiction corpus
Natural Language Processing (NLP), covering topics such as tokenization, part-of-speech (POS) tagging, machine translation, named entity recognition (NER), classification, and sentiment analysis.
A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text.
An open information extraction system that provides compact extractions
An inventory of Arabic NLP tools
A Python module that fetches a page of a word/phrase from the Online Indonesian Dictionary (https://kbbi.kemdikbud.go.id).
Linguistic Datasets for Portuguese: a list of linguistic datasets for the Portuguese language with flexible licenses, including databases, word lists, synonyms, antonyms, thematic dictionaries, thesauri, linked data, semantics, ontologies, and knowledge representation
Resources to go with the Indic NLP Library
A list of Romanian NLP Datasets
Dive into the world of Arabic NLP with this extensive collection of resources, tools, datasets, and best practices tailored for the Arabic language.
Python library for feature selection on text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models.
Assignment solutions for CS224N: Natural Language Processing with Deep Learning - Stanford / Winter 2023
Natural Language Processing Courses with Resources
A Python package for removing duplicate text in clinical notes or other documents