GitHunt

MUHAMMADAKMAL137/IMDB-Dataset-Classification-using-Pre-trained-Word-Embedding-with-GloVec-6B


In this project, I worked with a small corpus of simple sentences. I tokenized the words using n-grams from the NLTK library and performed word-level and character-level one-hot encoding. I also used the Keras Tokenizer to tokenize the sentences and implemented word embedding with the Embedding layer. For sentiment analysis, I used the IMDB dataset, a popular benchmark for binary sentiment classification. To improve the model's performance, I leveraged the pre-trained GloVe 6B 100-dimensional word embeddings, which provide useful semantic information for the words in the corpus. Overall, my project focuses on text preprocessing, tokenization, and pre-trained word embeddings for sentiment analysis, and demonstrates how these techniques can be combined to build effective machine learning models for classification tasks.
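The n-gram and word-level one-hot steps above can be sketched as follows. NLTK provides `nltk.ngrams` for the first step; this stdlib-only version mirrors its behavior so the example stays self-contained, and the sample sentences are illustrative, not the project's actual corpus.

```python
# Sketch of the tokenization and one-hot encoding steps; a minimal
# stdlib equivalent of nltk.ngrams is used, and the corpus below is
# an illustrative stand-in.

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return list(zip(*(tokens[i:] for i in range(n))))

corpus = ["the movie was great", "the plot was dull"]

# Word-level vocabulary: map each distinct word to an integer index.
vocab = {}
for sentence in corpus:
    for word in sentence.split():
        vocab.setdefault(word, len(vocab))

def one_hot(word):
    """Word-level one-hot vector over the vocabulary."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec

bigrams = ngrams("the movie was great".split(), 2)
# e.g. [('the', 'movie'), ('movie', 'was'), ('was', 'great')]
```

Character-level one-hot encoding works the same way, with the vocabulary built over characters instead of whitespace-split words.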
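Loading the GloVe 6B vectors typically means parsing a text file in which each line holds a token followed by its vector components, then filling an embedding matrix row per vocabulary word. A minimal sketch, using a tiny in-memory 4-dimensional sample in place of the real `glove.6B.100d.txt` (the vectors and the `word_index` below are illustrative, not real GloVe values):

```python
# Minimal sketch of building an embedding matrix from GloVe-format
# text. The sample below stands in for glove.6B.100d.txt; its numbers
# are made up for illustration.

import io

FAKE_GLOVE = """\
the 0.1 0.2 0.3 0.4
movie 0.5 0.6 0.7 0.8
great 0.9 1.0 1.1 1.2
"""

def load_glove(handle):
    """Parse GloVe-format text into {word: vector of floats}."""
    index = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        index[parts[0]] = [float(x) for x in parts[1:]]
    return index

embeddings = load_glove(io.StringIO(FAKE_GLOVE))

# Rows of this matrix would initialize the Embedding layer's weights;
# words missing from GloVe fall back to a zero vector.
word_index = {"the": 0, "movie": 1, "great": 2, "unknown": 3}
dim = 4
matrix = [[0.0] * dim for _ in word_index]
for word, i in word_index.items():
    if word in embeddings:
        matrix[i] = embeddings[word]
```

With Keras, such a matrix is usually passed as the initial weights of an `Embedding` layer, optionally frozen so the pre-trained vectors are not updated during training.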
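At lookup time, a Keras Embedding layer acts as a table: each integer token index from the Tokenizer selects one row of a (vocab_size x dim) weight matrix. A stdlib sketch of that lookup, with hypothetical weights rather than trained values:

```python
# The Embedding layer's forward pass is row selection by token index.
# These weights are hypothetical, not trained values.

weights = [
    [0.0, 0.0],   # index 0: reserved for padding
    [0.3, -0.1],  # index 1
    [0.8, 0.5],   # index 2
]

def embed(token_ids, table):
    """Map a sequence of integer token ids to their embedding rows."""
    return [table[i] for i in token_ids]

sequence = [1, 2, 0]
vectors = embed(sequence, weights)
# vectors is [[0.3, -0.1], [0.8, 0.5], [0.0, 0.0]]
```

During training, Keras updates these rows by backpropagation unless the layer is frozen, which is why initializing them from GloVe gives the model a semantic head start.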

Languages

Jupyter Notebook 100.0%

Contributors

Created August 1, 2023
Updated April 15, 2024