francescobaio/NLP_Assignments
Assignments in the realm of Natural Language Processing for Sexism Detection, developed as part of the NLP course at the University of Bologna.
NLP_Assignments
Overview of the Tasks
This repository addresses two key challenges in sexism detection using Natural Language Processing (NLP):
Assignment 1
Objective: Detecting and classifying sexism in textual data extracted from tweets.
Overview:
Our analysis compared the performance of LSTM and Transformer models. RoBERTa significantly outperformed the custom model, which we attribute to RoBERTa's larger number of parameters, its pretraining on a more extensive corpus, and its fine-tuning on the specific task.
Key Takeaways:
- Pre-trained transformer architectures like RoBERTa are highly effective for complex tasks such as sexism detection.
- The term "woman" is over-associated with sexism, which leads to systematic misclassifications.
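One way to check for the over-association described in the takeaways is to compare the false-positive rate on non-sexist tweets that contain a given term with the rate on those that do not. The snippet below is a minimal sketch of that idea, not the analysis used in the assignment: the function names, the record layout (`text`, `label`, `pred`) and the toy tweets are all assumptions made for illustration.

```python
import re


def false_positive_rate(examples, predicate):
    """False-positive rate over non-sexist examples whose text satisfies `predicate`.

    Each example is a dict with keys "text", "label" (gold, 1 = sexist)
    and "pred" (model output, 1 = sexist). This layout is hypothetical.
    Returns None when no non-sexist example matches the predicate.
    """
    negatives = [ex for ex in examples if ex["label"] == 0 and predicate(ex["text"])]
    if not negatives:
        return None
    false_positives = sum(1 for ex in negatives if ex["pred"] == 1)
    return false_positives / len(negatives)


def term_bias_report(examples, term):
    """Compare false-positive rates for texts with and without `term`."""
    # Whole-word, case-insensitive match; inflected forms such as "women"
    # are deliberately not matched by this simple pattern.
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)

    def has_term(text):
        return bool(pattern.search(text))

    return {
        "with_term": false_positive_rate(examples, has_term),
        "without_term": false_positive_rate(examples, lambda t: not has_term(t)),
    }


# Toy, hand-written records (not taken from the dataset).
toy = [
    {"text": "That woman gave a great talk", "label": 0, "pred": 1},
    {"text": "A woman won the race today", "label": 0, "pred": 0},
    {"text": "Lovely weather in Bologna", "label": 0, "pred": 0},
    {"text": "Traffic was bad this morning", "label": 0, "pred": 0},
]
report = term_bias_report(toy, "woman")
print(report)
```

A markedly higher `with_term` rate than `without_term` rate on real predictions would be consistent with the term driving misclassifications, although it does not prove causation on its own.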
Assignment 2
Objective: Classifying sexist content in textual data extracted from tweets.
Overview:
This task explores prompting techniques with Large Language Models (LLMs). The goal is to assess how well they understand nuanced contexts in sexist content and how effectively they classify complex and ambiguous cases.
Key Takeaways:
- Few-shot prompting proves to be the most effective technique for this type of task.
- The Chain of Thought (CoT) technique positively impacts model performance by enhancing reasoning capabilities.
- Experiments on a smaller dataset (CLEF EXIST Task 1, 2023) highlight the importance of fine-tuning, which:
  - Enables the use of lightweight models that outperform larger LLMs.
  - Aligns with recent research findings (Bucher and Martini, 2024).
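The prompting variants mentioned above (zero-shot, few-shot, and Chain of Thought) differ mainly in how the prompt is assembled. The sketch below illustrates that difference under stated assumptions: the instruction wording, the `YES`/`NO` answer format, and the helper names `build_prompt` and `parse_answer` are hypothetical and are not the prompts used in the notebooks.

```python
import re

# Hypothetical instruction text; the assignment's actual prompt may differ.
INSTRUCTION = "Classify the tweet as sexist or not sexist. Answer YES or NO."


def build_prompt(tweet, demonstrations=(), chain_of_thought=False):
    """Assemble a classification prompt.

    demonstrations: iterable of (tweet_text, "YES" | "NO") pairs.
    An empty iterable gives a zero-shot prompt, a non-empty one a few-shot prompt.
    chain_of_thought: if True, ask for step-by-step reasoning before the answer.
    """
    lines = [INSTRUCTION]
    for demo_text, demo_label in demonstrations:
        lines.append(f"TWEET: {demo_text}")
        lines.append(f"ANSWER: {demo_label}")
    lines.append(f"TWEET: {tweet}")
    if chain_of_thought:
        lines.append("Reason step by step, then finish with a line of the form 'ANSWER: YES' or 'ANSWER: NO'.")
    else:
        lines.append("ANSWER:")
    return "\n".join(lines)


def parse_answer(generation):
    """Map a model generation to 1 (sexist), 0 (not sexist) or None (unparseable).

    The last 'ANSWER: YES/NO' occurrence wins, so reasoning text that
    precedes the final answer is ignored.
    """
    matches = re.findall(r"ANSWER:\s*(YES|NO)", generation, flags=re.IGNORECASE)
    if not matches:
        return None
    return 1 if matches[-1].upper() == "YES" else 0


# Toy demonstrations written for this sketch, not taken from the dataset.
demos = [
    ("Women should stay in the kitchen", "YES"),
    ("The match starts at nine tonight", "NO"),
]
zero_shot = build_prompt("Example tweet")
few_shot = build_prompt("Example tweet", demonstrations=demos)
cot = build_prompt("Example tweet", demonstrations=demos, chain_of_thought=True)
print(few_shot)
```

The returned string would be passed to whichever LLM is being evaluated, and `parse_answer` applied to its generation; keeping prompt construction separate from model calls makes it easy to compare the techniques on the same tweets.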
Project Structure
Here's an overview of the repository structure:
.
├── assignment_1/              # Files related to Assignment 1
│   ├── html/                  # HTML report for Assignment 1
│   ├── assignment_1.ipynb     # Jupyter notebook
│   └── assignment_1.pdf       # PDF report for Assignment 1
│
├── assignment_2/              # Files related to Assignment 2
│   ├── html/                  # HTML report for Assignment 2
│   ├── assignment_2.ipynb     # Jupyter notebook
│   └── assignment_2.pdf       # PDF report for Assignment 2
│
├── docs/                      # Documentation files
│   └── index.html             # Main entry for online documentation
│
├── LICENSE                    # Project license
├── README.md                  # This file
├── _quarto.yml                # Quarto configuration file
└── index.qmd                  # Source file for the homepage
Contacts
- Francesco Baiocchi (francesco.baiocchi2@studio.unibo.it)
- Christian Di Buò (christian.dibuo@studio.unibo.it)
- Leonardo Petrilli (leonardo.petrilli@studio.unibo.it)
Documentation
Access the full documentation and assignment reports directly from this page: Online Documentation