bigforehead/Twitter-Malicious-Bot-Detection-Project

Applied machine learning and natural language processing to build a Random Forest classifier that filters malicious bots with 97% accuracy.

Twitter-Malicious-Bot-Detection-Project

Introduction

  • "Can my product ads reach real users on Instagram and avoid bots (e.g., fake followers)?" Yes, and this project shows how. As a group of data scientists at Fordham, we worked as a team to build a Random Forest classifier with 97% accuracy that filters out social bots using Twitter data. The model can be applied to other social media platforms such as Instagram and Facebook, so it can help you improve your ad revenue by maximizing the organic exposure of your ads on social media.

Goal

  • The project's purpose is to classify three types of malicious Twitter bots: fake followers, spam bots, and scam bots.

Methodology

  • Scrape tweets using the Twitter API (Tools: Python).
  • Machine learning: Random Forest (Tools: SPSS Modeler).
  • Natural language processing: derive new TF-IDF features to analyze tweets for the three identified bot types (Tools: nltk).
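
The TF-IDF step above can be sketched in a few lines. This is a minimal illustration, not the project's notebook code: it uses only the Python standard library rather than nltk, the toy tweets and the `tokenize` and `tfidf` helpers are hypothetical, and it assumes the common definition tf = term count / document length and idf = ln(N / document frequency).

```python
import math
from collections import Counter

# Hypothetical toy tweets, one per bot category; not taken from the project's dataset.
tweets = {
    "fake_follower": "follow me and get free followers fast",
    "spam_bot": "click this link to win a free prize now",
    "scam_bot": "send bitcoin now and double your money fast",
}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def tfidf(docs):
    """Return {doc_id: {term: weight}} with tf = count / length and idf = ln(N / df)."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    weights = {}
    for doc_id, words in tokens.items():
        counts = Counter(words)
        weights[doc_id] = {
            term: (count / len(words)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


weights = tfidf(tweets)
for doc_id, term_weights in weights.items():
    # Show the three highest-weighted terms per category.
    top_terms = sorted(term_weights.items(), key=lambda kv: kv[1], reverse=True)[:3]
    print(doc_id, top_terms)
```

Terms that appear in only one category (such as "bitcoin" in the toy scam tweet) receive a higher weight than terms shared across categories, which is what makes such features useful as inputs to a classifier.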

Result

  • A Random Forest model with an accuracy rate of 97%.
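
For reference, accuracy is the share of accounts whose predicted class matches the true class. The sketch below shows that calculation, plus a per-class recall breakdown, for a three-class problem. The labels are hypothetical and chosen only for illustration; they are not the project's results, and the `accuracy` and `per_class_recall` helpers are assumptions rather than code from the repository.

```python
from collections import Counter

# Hypothetical labels for illustration only; not the project's data or results.
y_true = ["fake_follower", "spam_bot", "scam_bot", "spam_bot", "fake_follower", "scam_bot"]
y_pred = ["fake_follower", "spam_bot", "scam_bot", "scam_bot", "fake_follower", "scam_bot"]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_recall(true_labels, predicted_labels):
    """For each true class, the fraction of its examples predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


print(accuracy(y_true, y_pred))
print(per_class_recall(y_true, y_pred))
```

Reporting recall per bot type alongside overall accuracy shows whether one class (here, the toy spam bots) is being misclassified more often than the others.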

Languages

Jupyter Notebook 100.0%

Created March 28, 2020
Updated February 27, 2025