Fei Han
hanfei1986
Senior AI Engineer @ Microsoft | Ex-Amazon | Ex-MIT | Machine Learning and Generative AI
Languages
Repos
46
Stars
14
Forks
1
Top Language
Jupyter Notebook
Top Repositories
When a significant amount of data in highly important features is missing, what can we do? Impute the missing data with the mean or median? In this Jupyter notebook, I demonstrate embedding an XGBoost model in the data transformer to do the imputation.
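A minimal sketch of the model-based imputation idea, not the notebook's actual code: it uses scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost, and the synthetic data and column layout are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
# Column 2 depends on columns 0 and 1, so a model can recover it.
X[:, 2] = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)
mask = rng.random(500) < 0.2
X_missing = X.copy()
X_missing[mask, 2] = np.nan

# Fit a boosted-tree model on rows where the important feature is observed...
obs = ~np.isnan(X_missing[:, 2])
model = GradientBoostingRegressor(random_state=0)
model.fit(X_missing[obs, :2], X_missing[obs, 2])

# ...then predict the missing entries instead of filling in a constant.
X_imputed = X_missing.copy()
X_imputed[~obs, 2] = model.predict(X_missing[~obs, :2])
```

Because the imputer exploits the relationship between columns, its fills track the true values far more closely than a mean or median would.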
When the training data is larger than memory, we can feed it to neural network training in multiple batches. This notebook demonstrates how to do it and visualizes the training and test losses.
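A minimal NumPy sketch of the batch-wise idea (the notebook itself uses a neural network; a linear model is substituted here to keep the example self-contained). The data, learning rate, and batch size are illustrative assumptions; in practice each batch would be read from disk rather than sliced from an in-memory array.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -3.0])
X = rng.normal(size=(10_000, 2))
y = X @ w_true + rng.normal(scale=0.1, size=10_000)

# Mini-batch gradient descent: only one batch is "in memory" at a time.
w = np.zeros(2)
lr, batch_size = 0.1, 256
losses = []
for epoch in range(5):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]  # in a real pipeline, load this batch from disk
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)
        w -= lr * grad
    losses.append(float(np.mean((X @ w - y) ** 2)))
```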
This Jupyter notebook demonstrates a Recursive Feature Elimination with Cross-Validation (RFECV) feature selection process with a random forest model.
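A condensed sketch of the RFECV process on synthetic data (the dataset and hyperparameters here are assumptions, not the notebook's):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)
# RFECV drops the weakest feature at each step and scores every subset
# with cross-validation, keeping the best-scoring subset.
selector = RFECV(RandomForestClassifier(n_estimators=50, random_state=0),
                 step=1, cv=3)
selector.fit(X, y)
kept = selector.support_  # boolean mask of the selected features
```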
This Python program is used to pre-process images and recognize characters in them (OCR) with pytesseract in a batch-processing way.
Imbalanced data are common in the real world, especially in anomaly-detection tasks. Handling the imbalance is important; otherwise the predictions are biased towards the majority class. RandomOverSampler, SMOTE, and ADASYN are useful oversampling tools that fabricate data for minority classes and balance the dataset.
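To show the simplest of these strategies without pulling in imbalanced-learn, here is a NumPy sketch of what RandomOverSampler does: resample the minority class with replacement until the class counts match. The 10:1 toy dataset is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(110, 2))
y = np.array([0] * 100 + [1] * 10)  # 10:1 class imbalance

# Draw minority-class rows with replacement until both classes are equal.
minority = np.flatnonzero(y == 1)
extra = rng.choice(minority, size=(y == 0).sum() - len(minority), replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])
```

SMOTE and ADASYN go one step further and interpolate new synthetic points between minority neighbors rather than duplicating rows.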
https://chatbot-v2.streamlit.app/
Repositories
46
No description provided.
Monte Carlo simulation is a computational technique that uses random sampling and statistical methods to estimate the behavior of complex systems or solve problems. It is particularly useful when dealing with problems that involve a high degree of randomness or complexity.
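A classic minimal example of the technique (not taken from the repository): estimating π by sampling random points and checking how many fall inside the quarter circle.

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((200_000, 2))         # uniform points in the unit square
inside = (pts ** 2).sum(axis=1) < 1.0  # inside the quarter circle of radius 1
pi_est = 4 * inside.mean()             # area ratio is pi/4
```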
The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" In this notebook, I demonstrate the solution of this problem with the genetic algorithm.
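A compact sketch of a genetic algorithm for TSP; the population size, mutation rate, and ordered-crossover operator are illustrative assumptions, not necessarily what the notebook uses.

```python
import numpy as np

rng = np.random.default_rng(0)
cities = rng.random((12, 2))

def tour_length(order):
    pts = cities[order]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())

def crossover(a, b):
    # Ordered crossover: copy a slice of parent a, fill the rest in b's order.
    i, j = sorted(rng.choice(len(a), size=2, replace=False))
    child = -np.ones(len(a), dtype=int)
    child[i:j] = a[i:j]
    child[child < 0] = [c for c in b if c not in a[i:j]]
    return child

pop = [rng.permutation(len(cities)) for _ in range(60)]
for _ in range(150):
    pop.sort(key=tour_length)
    parents = pop[:20]                # keep the fittest tours (elitism)
    children = []
    while len(children) < 40:
        a, b = rng.choice(len(parents), size=2, replace=False)
        child = crossover(parents[a], parents[b])
        if rng.random() < 0.3:        # swap mutation
            i, j = rng.choice(len(child), size=2, replace=False)
            child[i], child[j] = child[j], child[i]
        children.append(child)
    pop = parents + children
best = min(pop, key=tour_length)
```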
BERT is an NLP model developed by Google Research in 2018; since its inception it has achieved state-of-the-art accuracy on several NLP tasks. This notebook demonstrates fine-tuning BERT for sentiment analysis.
This is a CNN tutorial for beginners about a digits recognition model trained on the MNIST dataset. I built two models with TensorFlow/Keras and PyTorch/Skorch respectively.
The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" In this notebook, I demonstrate the solution of this problem with the simulated annealing algorithm.
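A minimal simulated-annealing sketch for the same problem; the cooling schedule and segment-reversal move are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
cities = rng.random((12, 2))

def tour_length(order):
    pts = cities[order]
    return float(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1).sum())

order = rng.permutation(len(cities))
initial_length = tour_length(order)
temp = 1.0
for step in range(5000):
    i, j = sorted(rng.choice(len(order), size=2, replace=False))
    candidate = order.copy()
    candidate[i:j + 1] = candidate[i:j + 1][::-1]  # 2-opt style reversal
    delta = tour_length(candidate) - tour_length(order)
    # Accept worse tours with probability exp(-delta / temp) so the search
    # can escape local minima; the temperature cools over time.
    if delta < 0 or rng.random() < np.exp(-delta / temp):
        order = candidate
    temp *= 0.999
```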
Calculating semiconductor chip yield against defect density using a Monte Carlo simulation is a common approach to assess the impact of defects on chip manufacturing. In this simulation, we'll randomly generate defect locations and evaluate chip yield based on specified criteria.
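One common way to frame this, sketched here under the standard Poisson yield model (the density and area values are illustrative, not taken from the repository): a chip is good if it has zero defects, so the simulated yield should approach exp(-D*A).

```python
import numpy as np

rng = np.random.default_rng(0)
defect_density = 0.5   # defects per cm^2
chip_area = 1.0        # cm^2
n_chips = 100_000

# Each chip's defect count is Poisson with mean D*A; zero defects = good chip.
defects = rng.poisson(defect_density * chip_area, size=n_chips)
sim_yield = (defects == 0).mean()
analytic_yield = np.exp(-defect_density * chip_area)
```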
This notebook demonstrates the charts I usually plot for exploratory data analysis for regression tasks.
SHAP is a powerful tool for interpreting feature importance in machine learning tasks. This Jupyter notebook gives a demonstration.
A histogram of an image provides valuable insights into the distribution of pixel intensities within that image. This notebook briefly shows how to plot the histogram. Furthermore, we can replot the picture as a heatmap based on its pixel intensities.
When a significant amount of data is missing, what can we do? Impute the missing data with the mean or median? Actually, Scikit-Learn provides two powerful imputers, KNNImputer and IterativeImputer, which can do this work effectively.
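A tiny KNNImputer example on hand-made data (the matrix is an assumption for illustration): the missing value is filled with the mean of its two nearest rows, measured with a NaN-aware Euclidean distance.

```python
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0],
              [2.0, np.nan],
              [3.0, 6.0],
              [4.0, 8.0]])
# The NaN in row 1 is replaced by the mean of the second column of its
# two nearest neighbors (rows 0 and 2): (2 + 6) / 2 = 4.
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
```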
Monte Carlo integration is particularly useful when dealing with high-dimensional integrals or integrals over complex, irregularly shaped domains where traditional methods may be impractical. It's widely used in various fields, including physics, finance, and engineering, for solving problems involving numerical integration.
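The one-dimensional version of the idea, as a minimal example (not the repository's code): the integral of f over [0, 1] is the expected value of f under uniform sampling, so averaging f at random points estimates it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Estimate the integral of x^2 over [0, 1], whose exact value is 1/3.
x = rng.random(200_000)
estimate = (x ** 2).mean()
```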
Word2Vec is a popular word embedding technique that converts words into vectors in a high-dimensional space, capturing semantic relationships between words. This notebook demonstrates embedding text data with Word2Vec for sentiment analysis.
PCA or truncated SVD reduces dimensionality of data by transforming the data into a lower-dimensional space. In this notebook a chart visualizes how much variance of the original data is picked up in the new components. The data transformation process is also explained.
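A small sketch of the explained-variance view (the correlated toy data is an assumption): when two columns are nearly collinear, the first component captures almost all of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(300, 1))
# Columns 0 and 1 are strongly correlated; column 2 is small noise.
X = np.hstack([base,
               base * 2 + rng.normal(scale=0.1, size=(300, 1)),
               rng.normal(scale=0.1, size=(300, 1))])
pca = PCA(n_components=3).fit(X)
ratios = pca.explained_variance_ratio_  # variance captured per component
```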
Matrix factorization is a class of collaborative filtering algorithms used in recommender systems. Matrix factorization algorithms work by decomposing the user-item interaction matrix into the product of two lower-dimensional rectangular matrices.
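A gradient-descent sketch of the decomposition on a toy rating matrix (the matrix, rank, and learning rate are illustrative assumptions): user and item factor matrices are fitted so their product matches the observed ratings, while zeros (unobserved entries) are masked out of the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)  # 0 = unobserved
mask = R > 0
k = 2                                      # latent dimensions
P = rng.normal(scale=0.1, size=(5, k))     # user factors
Q = rng.normal(scale=0.1, size=(4, k))     # item factors

lr, reg = 0.01, 0.01
for _ in range(2000):
    E = mask * (R - P @ Q.T)               # error on observed entries only
    P += lr * (E @ Q - reg * P)
    Q += lr * (E.T @ P - reg * Q)
E = mask * (R - P @ Q.T)
rmse = float(np.sqrt((E ** 2).sum() / mask.sum()))
```

The unobserved entries of `P @ Q.T` then serve as rating predictions.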
TFIDF (term frequency-inverse document frequency) is a statistical measure that evaluates how relevant a word is to a document in a collection of documents. This notebook demonstrates how to embed text data with TFIDF and do sentiment analysis based on it.
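A toy sketch of the pipeline (the four documents and classifier choice are assumptions for illustration): TF-IDF turns each document into a weighted bag-of-words vector, which a linear classifier can then separate.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["great movie, loved it",
        "wonderful acting, great film",
        "terrible plot, awful movie",
        "awful acting, terrible film"]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# Words frequent in one document but rare across the corpus get high weight.
vec = TfidfVectorizer()
X = vec.fit_transform(docs)
clf = LogisticRegression().fit(X, labels)
pred = clf.predict(vec.transform(["loved this wonderful film"]))
```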
ResNet models are lightweight pre-trained computer vision models. This notebook demonstrates how to infer the object in a picture with ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152.
Increase the density of data by interpolation.
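A minimal example of the idea (the sample data is an assumption): `np.interp` fills in linearly interpolated values on a denser grid between sparse samples.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = x ** 2                           # sparse samples of a curve
x_dense = np.linspace(0, 3, 31)      # roughly 10x denser grid
y_dense = np.interp(x_dense, x, y)   # linear interpolation between samples
```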
Linear regression models are widely used in industry for regression tasks as they are straightforward and easy to interpret. To capture non-linear patterns in data, polynomial features need to be added. However, high-degree polynomial features lead to overfitting. To solve the problem, regularization can be added to the loss function.
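A sketch of the combination described above (the sine data, degree, and penalty strength are assumptions): polynomial features capture the non-linearity, while the Ridge (L2) penalty keeps the coefficients small to limit overfitting.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 100).reshape(-1, 1)
y = np.sin(2 * x).ravel() + rng.normal(scale=0.1, size=100)

# Degree-9 polynomial features plus an L2 penalty on the coefficients.
model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=0.1))
model.fit(x, y)
r2 = model.score(x, y)
```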
Usually tree-based and neural network regressors work better for regression tasks than linear regression models, because they can capture complex or subtle non-linear patterns in data.
Keras and Skorch provide wrappers that simplify building neural network models. However, the wrappers sacrifice some of the models' flexibility. In some scenarios, such as early stopping and batch reading, building neural network models from scratch is still very useful.
With the python-pptx library, we can automate the updating of PowerPoint slides.
Two-sample t-test is a statistical hypothesis test used to determine if there is a significant difference between two independent groups. If the p-value is less than the chosen significance level (for example 0.05), you reject the null hypothesis and conclude that there is a significant difference between the groups.
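A short example with SciPy (the group means and sizes are illustrative assumptions): two samples drawn from populations with different means should yield a p-value below 0.05.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=10.0, scale=2.0, size=100)
b = rng.normal(loc=12.0, scale=2.0, size=100)

# Null hypothesis: the two groups share the same mean.
t_stat, p_value = stats.ttest_ind(a, b)
significant = p_value < 0.05
```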
A bar chart race is an elegant animation that depicts the progress of multiple categories over time. We can create one in Python.