Paul Lerner
PaulLerner
Postdoc at Sorbonne Université, CNRS, ISIR
Languages
Repos
52
Stars
52
Forks
37
Top Language
Python
Top Repositories
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retrieval (Lerner et al., ECIR'24)
🤔 A Python Library to Compute LLM's Perplexity and Surprisal
Automatic translation from Standard to Inclusive French, and vice-versa
Repository for the first Practical Work of ENSAE's Deep Learning class 2024-2025
interface for annotating wug tests
Repositories
52
interface for annotating wug tests
🤔 A Python Library to Compute LLM's Perplexity and Surprisal
Python utils to process LaTeX
Main repository for the 2024-2026 Natural Language Processing class at aivancity by Paul Lerner
Postdoc at Sorbonne Université, CNRS, ISIR
No description provided.
Dataset and code for the paper "Assessing the Political Fairness of Multilingual LLMs: A Case Study based on a 21-way Multiparallel EuroParl Dataset" (Lerner and Yvon, 2025)
A LaTeX template for an ANR-DFG grant proposal.
Automatic translation from Standard to Inclusive French, and vice-versa
Machine Learning applied to Natural Language Processing Toolkit used in the Lisbon Machine Learning Summer School
Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retrieval (Lerner et al., ECIR'24)
Lisbon Machine Learning Summer School Lab Guide
Scrape affluences.com
Repository for the second Practical Work of ENSAE's Deep Learning class 2024-2025.
Repository for the first Practical Work of ENSAE's Deep Learning class 2024-2025
Multilingual sentence alignment using sentence embeddings
Toolkit to compile a comparable/parallel corpus from European Parliament proceedings
Source code and data for the papers by Lerner and Yvon: Towards the Machine Translation of Scientific Neologisms / Unlike “Likely”, “Unlike” is Unlikely: BPE-based Segmentation hurts Morphological Derivations in LLMs
A tool for Lexematic Segmentation by Paul Lerner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Data for the paper Unlike “Likely”, “Unlike” is Unlikely: BPE-based Segmentation hurts Morphological Derivations in LLMs (Lerner and Yvon, 2025)
No description provided.
Symptoms subset of TERMIUM
No description provided.
No description provided.
Hit the fork button!
No description provided.
Map IMDb to Allociné, for the main purpose of collecting French press reviews/ratings.
Data loader for pyannote.db.plumcot
Build and train PyTorch models and connect them to the ML lifecycle using Lightning App templates, without handling DIY infrastructure, cost management, scaling, and other headaches.