rochitasundar/Twitter-Sentiment-Analysis

The data consists of tweets scraped using the Twitter API. The objective is to label each tweet's sentiment using a lexicon approach, pre-process the text (language detection, tokenisation, normalisation, vectorisation), build pipelines for text classification models for sentiment analysis, and explain the final classifier.

feature-importance gridsearchcv lemmatization linearsvc-model multinomial-naive-bayes pipeline random-forest regex sentiment-classification stemming-porters stratified-sampling textblob tfidfvectorizer xgboost-model

Twitter Sentiment Analysis Lab

Dataset (tweets.csv)

The dataset contains approximately 2000 distinct (scraped) tweets with the following attributes:

'id' : unique 19-digit id for each tweet
'created_at' : date & time of each tweet (or retweet)
'text' : text content of the tweet
'location' : location the tweet originated from
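
As a rough illustration of the layout described above, here is a minimal loading sketch that uses only the Python standard library. The function name `load_tweets`, the sample rows and the `created_at` format are assumptions made for this example; they are not taken from tweets.csv or from the notebook.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["id", "created_at", "text", "location"]


def load_tweets(handle):
    """Read tweets from an open CSV file handle into a list of dicts.

    Ids are kept as strings so that 19-digit values are never rounded
    by an accidental conversion to float.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{col: row[col] for col in EXPECTED_COLUMNS} for row in reader]


# Made-up sample rows standing in for tweets.csv (hypothetical data).
sample = io.StringIO(
    "id,created_at,text,location\n"
    '1510000000000000001,2022-04-01 10:15:00,"Great day, loving it!",London\n'
    "1510000000000000002,2022-04-01 10:16:30,Not impressed at all,\n"
)
tweets = load_tweets(sample)
```

With the real file, the same function could be called as `load_tweets(open("tweets.csv", newline="", encoding="utf-8"))`. In a notebook, `pandas.read_csv` with `dtype={"id": str}` would be a natural alternative.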

Objective

Sentiment label - for each tweet, based on its text, devise a method to assign an appropriate sentiment ('positive', 'negative' or 'neutral'). This is achieved using TextBlob (https://textblob.readthedocs.io/en/dev/)
Text Analytics/NLP - extract features from the tweet texts
Machine Learning - build a robust & optimized ML model that accurately predicts the sentiment associated with each tweet, and explain the built model
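
The three objectives above can be sketched roughly as follows. This is a sketch under stated assumptions, not the notebook's actual code: the cleaning rules, the polarity threshold of 0.0, the choice of LinearSVC as the classifier, and the hyperparameter grid are all illustrative, and the helper names (`clean_tweet`, `polarity_to_label`, `label_tweet`, `build_search`) are hypothetical. TextBlob and scikit-learn are imported lazily so the cleaning and thresholding helpers run without them.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, '#' and non-letters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_ALPHA_RE.sub(" ", text)
    return " ".join(text.split())       # collapse repeated whitespace


def polarity_to_label(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to one of three sentiment labels.

    The threshold is an assumption; a wider neutral band may suit real data.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def label_tweet(text):
    """Lexicon-based label from TextBlob's polarity score."""
    from textblob import TextBlob  # third-party, imported only when needed
    return polarity_to_label(TextBlob(text).sentiment.polarity)


def build_search():
    """TF-IDF + LinearSVC pipeline wrapped in a small grid search."""
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(preprocessor=clean_tweet)),
        ("clf", LinearSVC()),
    ])
    param_grid = {
        "tfidf__ngram_range": [(1, 1), (1, 2)],
        "clf__C": [0.1, 1.0, 10.0],
    }
    return GridSearchCV(pipeline, param_grid, cv=5, scoring="f1_macro")
```

A typical use would be to compute `labels = [label_tweet(t) for t in texts]`, split the data with a stratified train/test split, and call `build_search().fit(train_texts, train_labels)`. The fitted search object then exposes `best_params_` and `best_estimator_` for inspection.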

Languages

Jupyter Notebook 100.0%

Created April 3, 2022

Updated January 16, 2024