Bram Vanroy
BramVanroy
👋 My name is Bram and I work on natural language processing and machine translation (evaluation) but I also spend a lot of time in this open-source world 🌍
Top Repositories
A small repo showing how to easily use BERT (or other transformers) for inference
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
An open, efficient LLM for Dutch
An example of how to use spaCy for extremely large files without running into memory issues
MAchine Translation Evaluation Online (MATEO)
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. Useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
Repositories (35)
An open, efficient LLM for Dutch
Software to create C5: Common Crawl crawls annotated with potential Creative Commons license information
MAchine Translation Evaluation Online (MATEO)
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Notebooks on improving models with synthetic data and comparing model improvements
Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Doc and its sentences and tokens. Can also be used as a command-line tool.
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. Useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ for the code.
Tools for checking ACL paper submissions
An adapted version of the Ionic conference app for CLIN35
Segmentation interface for the TPR-DB for manual tokenization and sentence segmentation
Download and load spaCy models on-the-fly
Robust recipes to align language models with human and AI preferences
A small repo showing how to easily use BERT (or other transformers) for inference
A Python package that interacts with the CLARIN SPF API to retrieve the 'logged in' cookies needed to access the APIs of services that require authentication.
An example of how to use spaCy for extremely large files without running into memory issues
Sentence-Level Text Simplification for Dutch
A word aligner based on multilingual encoders
Benchmarking throughput of MBART
Demo app to illustrate ASTrED
Transformer trainer for a variety of classification problems, used in-house at LT3 for different research topics.