rochitasundar/Twitter-Sentiment-Analysis

The data consists of tweets scraped using the Twitter API. The objective is to label sentiment using a lexicon approach, perform text pre-processing (such as language detection, tokenisation, normalisation and vectorisation), build pipelines for text classification models for sentiment analysis, and explain the final classifier.

Twitter Sentiment Analysis Lab

Dataset (tweets.csv)

The dataset contains approximately 2000 distinct (scraped) tweets with the following attributes:

  • 'id' : unique 19-digit id for each tweet
  • 'created_at' : date & time of each tweet (or retweet)
  • 'text' : text content of the tweet
  • 'location' : origin of tweet
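
As a rough illustration of the attributes above, the file could be read with Python's standard csv module. This is a minimal sketch, not the notebook's actual loading code: it assumes tweets.csv has a header row using exactly these four column names and is UTF-8 encoded, and the helper name load_tweets is hypothetical. The 'id' is kept as a string so the 19-digit value is never rounded by a float conversion.

```python
import csv

# Column names taken from the attribute list above; that the CSV header
# uses exactly these names is an assumption.
EXPECTED_COLUMNS = ("id", "created_at", "text", "location")


def load_tweets(source):
    """Read tweets from an open text file (or any iterable of CSV lines).

    Hypothetical helper: returns a list of dicts, one per tweet, with all
    values kept as stripped strings.
    """
    reader = csv.DictReader(source)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    # Empty or absent cells become empty strings rather than None.
    return [
        {col: (row[col] or "").strip() for col in EXPECTED_COLUMNS}
        for row in reader
    ]
```

Usage would look like `with open("tweets.csv", newline="", encoding="utf-8") as f: tweets = load_tweets(f)`, again assuming that file name and encoding.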

Objective

  • Sentiment label - for each tweet, devise a method to assign an appropriate sentiment ('positive', 'negative' or 'neutral') based on its text. This is achieved using TextBlob (https://textblob.readthedocs.io/en/dev/)
  • Text Analytics/NLP - to extract features from tweet texts
  • Machine Learning - build a robust & optimized ML model to accurately predict the sentiment associated with each tweet, and explain the built model
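
The sentiment labelling objective can be sketched as follows. TextBlob exposes a polarity score between -1 and 1 through `TextBlob(text).sentiment.polarity`; how that score is mapped to the three labels is not stated here, so the cut-off of 0.05 and the function names below are assumptions for illustration rather than the repository's method.

```python
def label_from_polarity(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    The threshold is an assumed value: scores within +/- threshold of
    zero are treated as 'neutral'.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def label_tweet(text, threshold=0.05):
    """Label one tweet using TextBlob's lexicon-based polarity score."""
    # Imported lazily so the thresholding helper works without TextBlob.
    from textblob import TextBlob

    polarity = TextBlob(text).sentiment.polarity
    return label_from_polarity(polarity, threshold)
```

With TextBlob installed, `label_tweet(tweet["text"])` could be applied to each row to produce the label column. A wider neutral band gives fewer, more confident positive and negative labels, so the threshold is worth tuning against a sample of hand-checked tweets.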

Languages

Jupyter Notebook 100.0%

Created April 3, 2022
Updated January 16, 2024