Repos
121
Stars
819
Forks
338
Top Language
Python
Top Repositories
Scientific Document Summarization Corpus and Annotations from the WING NUS group.
Source code for the ACL 2018 paper entitled "Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures" by Wenqiang Lei et al.
Neuralized version of the Reference String Parser component of the ParsCit package.
This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020).
Code and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classification Framework
The Web IR / NLP Group (WING)'s public reading group at the National University of Singapore.
Repositories
121
Hugo Blox WING Website pilot
No description provided.
Scientific Document Summarization Corpus and Annotations from the WING NUS group.
The Web IR / NLP Group (WING)'s public reading group at the National University of Singapore.
An open, large-scale, interactive textbook.
The Dataset and Official Implementation for <Discursive Circuits: How Do Language Models Understand Discourse Relations?> @ EMNLP 2025
The Summarizer from the Web IR / NLP Group (WING), hence SWING, is a modular, state-of-the-art automatic extractive text summarization system. It is used as the basis for summarization research at the National University of Singapore. It ranks among the leading automatic summarization systems in the international TAC competition, scoring highly on the ROUGE evaluation measure.
Code and Dataset for the Bhola et al. (2020) Retrieving Skills from Job Descriptions: A Language Model Based Extreme Multi-label Classification Framework
Official code for the ACL 2025 paper "Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines"
This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020).
No description provided.
No description provided.
The Dataset and Official Implementation for <The ELCo Dataset: Bridging Emoji and Lexical Composition> @ LREC-COLING 2024
A crawler for Coursera
No description provided.
Source code for the ACL 2018 paper entitled "Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures" by Wenqiang Lei et al.
This is the distribution point for the NUS SMS Corpus, a corpus of SMS (Short Message Service) messages collected for research at the Department of Computer Science at the National University of Singapore. This dataset consists of 67,093 SMS messages taken from the corpus on Mar 9, 2015. The messages largely originate from Singaporeans, mostly students attending the University. The messages were collected from volunteers who were made aware that their contributions would be made publicly available. The data collectors opportunistically collected as much metadata about the messages and their senders as possible to enable different types of analyses. This corpus was collected by Tao Chen and Min-Yen Kan. If you use this data, please cite the following paper; for more details, refer to the Citation field. Tao Chen and Min-Yen Kan (2013). Creating a Live, Public Short Message Service Corpus: The NUS SMS Corpus. Language Resources and Evaluation, 47(2), pages 299-355. URL: https://link.springer.com/article/10.1007%2Fs10579-012-9197-9
No description provided.
Student Submission Integrity Diagnosis
JavaRAP is an implementation of the classic Resolution of Anaphora Procedure (RAP) given by Lappin and Leass (1994). It resolves third-person pronouns and lexical anaphors, and identifies pleonastic pronouns. The original purpose of the implementation was to provide anaphora resolution results to our TREC 2003 Q&A system.
Neuralized version of the Reference String Parser component of the ParsCit package.
This is the repository for NAACL'25 paper "TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning"
No description provided.
Source code for the COLING 2018 paper entitled "Identifying Emergent Research Trends by Key Authors and Phrases" by Shenhao Jiang et al.
Source code for the AAAI 2018 paper entitled "Linguistic Properties Matter for Implicit Discourse Relation Recognition: Combining Semantic Interaction, Topic Continuity and Attribution" by Wenqiang Lei et al.
[EMNLP 2024] Multi-expert Prompting Improves Reliability, Safety and Usefulness of Large Language Models
No description provided.
Code for the EMNLP 2020 paper "Re-examining the Role of Schema Linking in Text-to-SQL".
[Preprint '24] LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs