Andi Osika
andiosika
CSR and operations specialist who fell in love with philanthropy and more so with data. 'Learning how to use it as a superpower for do-gooding!
Languages
Repos
303
Stars
17
Forks
7
Top Language
Jupyter Notebook
Loading contributions...
Top Repositories
Natural Language Processing: A multi-headed model capable of detecting different types of online discussion toxicity like threats, obscenity, insults, and identity-based hate using Keras RNN LSTM and focal loss to address a hyper-imbalanced dataset.
SQL & A/B / Hypothesis Testing to inform business intelligence
Using binomial classification to predict COVID-19 infection on a large dataset (>618K samples) with extreme imbalance and minority class (.13% of samples) as target. The final iteration is a manually tuned random forsest classifier with >95% accuracy and >64% recall.
Sentiment Analysis using Natural Language Processing (NLP) Multi-Classification Support Vector Modeling for & Clustering / Segmentation using Latent Dirichlet Allocation (LDA):
Using 21 categorical and numeric features in a multivariate linear regression to find that 79% of a home price can be positively affected by a combination of certain features like location, square feet, condition and age of the home.
Examples of Tableau dashboards that have been created using various datasets
Repositories
303Natural Language Processing: A multi-headed model capable of detecting different types of online discussion toxicity like threats, obscenity, insults, and identity-based hate using Keras RNN LSTM and focal loss to address a hyper-imbalanced dataset.
Sentiment Analysis using Natural Language Processing (NLP) Multi-Classification Support Vector Modeling for & Clustering / Segmentation using Latent Dirichlet Allocation (LDA):
SQL & A/B / Hypothesis Testing to inform business intelligence
Using binomial classification to predict COVID-19 infection on a large dataset (>618K samples) with extreme imbalance and minority class (.13% of samples) as target. The final iteration is a manually tuned random forsest classifier with >95% accuracy and >64% recall.
Using 21 categorical and numeric features in a multivariate linear regression to find that 79% of a home price can be positively affected by a combination of certain features like location, square feet, condition and age of the home.
No description provided.
Examples of Tableau dashboards that have been created using various datasets
merging and concat
data exploration and visualization
ca
No description provided.
SQL & A/B / Hypothesis Testing to inform business intelligence
No description provided.
practicing data cleaning techniques
timeseries
Web Traffic Time Series Forecasting
No description provided.
Data collected from Maricopa County Recorder
multivariate linear regression
Tableau Dashboards
NLP to identify toxic or abusive language for online conversation using Keras / Deep Learning Models
Non-profit and operations analyst learning as much as she can about data science theory and application in hopes to one day use her superpowers for good.
No description provided.
No description provided.
Winter 2021 Data Science Intern Challenge
No description provided.
Capstone Project
No description provided.
No description provided.
No description provided.