Abrar2652/IEEE-CIS-Fraud-Detection-Project

This is the first project completed in the Upskill ISA program with Intelligent Machines. It was done after the competition had ended. The XGBClassifier model achieved a public score of 0.950844 on Kaggle.

IEEE-CIS-Fraud-Detection-Project

Description

This project is part of the Machine Learning course offered by the Upskill Income Sharing Agreement program with Intelligent Machines.
The dataset comes from a Kaggle competition that had already ended, and different machine learning models were benchmarked on it. The data contains real-world e-commerce transactions from Vesta, with a wide range of features from device type to product features. Competitors had to develop a machine learning model that predicts whether a transaction is fraudulent. The goal is to improve the efficacy of fraudulent transaction alerts for millions of people around the world, helping hundreds of thousands of businesses reduce their fraud losses and increase their revenue.
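The Kaggle competition scored submissions by area under the ROC curve, so the public score quoted above is an AUC value. The following is a minimal sketch of that evaluation loop, not the notebook's actual code: it uses synthetic, imbalanced data instead of the Vesta transactions, and scikit-learn's `GradientBoostingClassifier` as a stand-in for `XGBClassifier` so that it runs without XGBoost installed. All variable names and hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with roughly 5% positives; this is NOT the Vesta data.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=0,
)

# Stratify so that the rare "fraud" class appears in both splits.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stand-in gradient-boosted model (the project itself used XGBClassifier).
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# ROC AUC is computed from predicted probabilities, not hard 0/1 labels.
fraud_probability = model.predict_proba(X_valid)[:, 1]
validation_auc = roc_auc_score(y_valid, fraud_probability)
print(f"validation ROC AUC: {validation_auc:.4f}")
```

Because fraud is rare, accuracy would look high even for a model that never flags anything; scoring the predicted probabilities with ROC AUC avoids that trap.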

Dataset

Getting Started

The main challenge of this project is the very large number of features. It is difficult to remove unnecessary ones without knowing which factors to consider when choosing features. Training the models on all of these features wastes a lot of time and is unlikely to produce a better score. The starting point should therefore be data exploration, data cleaning, handling null values, and feature engineering.
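One common first pass at the null-value problem is to drop columns that are almost entirely empty. The sketch below illustrates that idea on a toy DataFrame; the helper name `drop_sparse_columns`, the 0.9 threshold, and the column names are assumptions made for illustration and are not taken from the project notebook.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df, max_null_fraction=0.9):
    """Return df without the columns whose share of missing values exceeds the threshold."""
    # isna().mean() gives the fraction of missing values per column.
    null_fraction = df.isna().mean()
    keep = null_fraction[null_fraction <= max_null_fraction].index
    return df[keep]


# Toy data with hypothetical column names.
toy = pd.DataFrame({
    "amount": [10.0, 25.5, 7.2, 99.0],
    "mostly_missing": [np.nan, np.nan, np.nan, np.nan],
    "half_missing": [1.0, np.nan, 3.0, np.nan],
})

reduced = drop_sparse_columns(toy, max_null_fraction=0.9)
print(list(reduced.columns))  # ['amount', 'half_missing']
```

The threshold is a judgment call: a sparse column can still carry signal, so it is worth checking how the remaining columns relate to the target before discarding anything permanently.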

Dependencies

Programming language: Python

Libraries: NumPy, Pandas, Matplotlib, Seaborn, scikit-learn, XGBoost (XGBClassifier)

Environment: Kaggle Notebook

Executing program

Help

If you have difficulty running the model on your local machine or in a Google Colab notebook, check whether the kernel is running on CPU or GPU. If it is running on CPU, change the runtime to GPU. I ran this notebook on a machine with 4 GB RAM and a 2.4 GHz Intel(R) Core(TM) i3 CPU and faced many difficulties, including sudden shutdowns due to overheating and running out of resources. The Kaggle environment worked well for me.
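If memory is the bottleneck on a small machine, one widely used workaround is to store numeric columns in the smallest dtype that can hold their values. The sketch below shows the idea on a toy DataFrame; the helper name `downcast_numeric` and the column names are hypothetical, and this step is not claimed to be part of the original notebook.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df):
    """Return a copy of df with numeric columns stored in smaller dtypes where possible."""
    out = df.copy()
    # Integers shrink to the smallest signed integer type that fits (down to int8).
    for col in out.select_dtypes(include=[np.integer]).columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    # Floats shrink to float32 at most, which can cost some precision.
    for col in out.select_dtypes(include=[np.floating]).columns:
        out[col] = pd.to_numeric(out[col], downcast="float")
    return out


# Toy data with hypothetical column names.
toy = pd.DataFrame({
    "count": np.array([1, 2, 3], dtype=np.int64),
    "amount": np.array([0.5, 1.5, 2.5], dtype=np.float64),
})

small = downcast_numeric(toy)
print(small.dtypes)
print(toy.memory_usage(deep=True).sum(), "->", small.memory_usage(deep=True).sum())
```

On a wide table the savings add up quickly, but float32 keeps only about seven significant digits, so it is worth confirming that the reduced precision does not affect the features that matter.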

Authors

Md. Abrar Jahin

LinkedIn

License

This project is licensed under the Apache License 2.0. See the LICENSE.md file for details.

Acknowledgments

StackOverflow, Towards Data Science articles, Data Exploration and Feature Engineering Techniques of Kaggle Grandmasters, DataCamp

Languages

Jupyter Notebook 100.0%

Created June 10, 2021
Updated June 10, 2021