bhavyabhagerathi/Forecasting-Pharma-Bounce-Rate

This time-series analysis project was completed as part of a Machine Learning internship at 360DigitMG.

Forecasting-Pharma-Bounce-Rate

Steps involved

Importing Libraries

The code starts by importing the necessary libraries, including pandas, numpy, statsmodels, dtale, and matplotlib. These libraries are commonly used for data manipulation, visualization, and time series analysis.

Reading Data

The code reads data from an Excel file named "Projectfinaldata.xlsx" using pandas and stores it in a DataFrame called "data."

Data Exploration

The code then explores the data: it checks for duplicate rows and removes them, then checks for missing values and drops the rows that contain them.
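The duplicate and missing-value checks described above can be sketched as follows. This is a minimal illustration on a tiny made-up DataFrame, not the project's actual data; the column names "DrugName" and "Quantity" are assumptions, and only "Dateofbill" comes from the project description.

```python
import pandas as pd

# Tiny synthetic stand-in for the billing data (values are invented).
data = pd.DataFrame({
    "Dateofbill": ["2022-01-03", "2022-01-03", "2022-01-05", "2022-01-07"],
    "DrugName": ["ONDANSETRON 2MG/ML", "ONDANSETRON 2MG/ML", None, "SEVOFLURANE 99.97%"],
    "Quantity": [2, 2, 1, 4],  # hypothetical column
})

# Count fully duplicated rows and missing values per column.
n_duplicates = int(data.duplicated().sum())
n_missing = data.isna().sum()

# Drop duplicates first, then any row with a missing value.
cleaned = data.drop_duplicates().dropna().reset_index(drop=True)
```

Counting before dropping makes it easy to report how many rows each step removed.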

Data Preprocessing

The code converts the "Dateofbill" column from an object (string) to a datetime format and sorts the data based on the date.
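A minimal sketch of this conversion, assuming the dates are stored as ISO-formatted strings (the actual format in the Excel file is not stated) and using a hypothetical "Quantity" column:

```python
import pandas as pd

# Synthetic example rows; only the "Dateofbill" column name comes from the project.
data = pd.DataFrame({
    "Dateofbill": ["2022-03-15", "2022-01-02", "2022-02-10"],
    "Quantity": [3, 1, 2],  # hypothetical column
})

# Convert the string column to datetime64, then sort chronologically.
data["Dateofbill"] = pd.to_datetime(data["Dateofbill"])
data = data.sort_values("Dateofbill").reset_index(drop=True)
```

If the real column uses another date format, passing an explicit `format=` argument to `pd.to_datetime` avoids ambiguous day/month parsing.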

Auto EDA using dtale

The code uses the dtale library to create an interactive web-based exploratory data analysis (EDA) dashboard for the preprocessed data. This helps in understanding the data visually.

Time Series Analysis

The code focuses on time series analysis for the top 5 drugs in the dataset.

Top 5 drugs

"SODIUM CHLORIDE IVF 100ML",
"SEVOFLURANE 99.97%",
"SODIUM CHLORIDE 0.9%",
"ONDANSETRON 2MG/ML" and
"MULTIPLE ELECTROLYTES 500ML IVF"

Resample

The data for each drug is resampled into monthly intervals.
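One way to do this with pandas is sketched below. The rows are synthetic, the "DrugName" and "Quantity" column names are assumptions, and summing within each month is an assumed aggregation; the project may aggregate differently.

```python
import pandas as pd

# Synthetic bills for two of the listed drugs (quantities are invented).
bills = pd.DataFrame({
    "Dateofbill": pd.to_datetime([
        "2022-01-03", "2022-01-20", "2022-02-11", "2022-03-05", "2022-03-28", "2022-02-14",
    ]),
    "DrugName": [
        "SEVOFLURANE 99.97%", "SEVOFLURANE 99.97%", "SEVOFLURANE 99.97%",
        "SEVOFLURANE 99.97%", "SEVOFLURANE 99.97%", "ONDANSETRON 2MG/ML",
    ],
    "Quantity": [2, 3, 1, 4, 5, 7],  # hypothetical column
})

drug = "SEVOFLURANE 99.97%"

# Filter one drug, index by date, and sum quantities per calendar month.
# "MS" labels each bucket with the first day of the month.
monthly = (
    bills[bills["DrugName"] == drug]
    .set_index("Dateofbill")["Quantity"]
    .resample("MS")
    .sum()
)
```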

Decomposition

The code decomposes the time series for each drug using the seasonal_decompose function from statsmodels. The decomposition separates each series into trend, seasonal, and residual components, which makes each of them easier to inspect.

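Autocorrelation and Partial Autocorrelation (ACF and PACF) Plots

The code plots the ACF and PACF of each drug's time series to help choose the orders of the autoregressive (AR) and moving-average (MA) components. As a rule of thumb, the lag at which the PACF cuts off suggests the AR order, and the lag at which the ACF cuts off suggests the MA order.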

Auto ARIMA Model

The code uses the pmdarima library to automatically find the best ARIMA model for each drug's time series based on the AIC (Akaike Information Criterion) values.

Model Fitting and Forecasting

The code fits the ARIMA models to each drug's time series data and makes future predictions for the next 12 months.

Model Evaluation

The code calculates the Mean Absolute Percentage Error (MAPE) to evaluate the accuracy of the ARIMA models on the test data.
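MAPE is the mean of the absolute errors expressed as a percentage of the actual values. A small sketch, with a hypothetical helper name `mape` and made-up numbers:

```python
import numpy as np


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any actual value is zero, since it divides by the actuals.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


# Each prediction is off by 10% of its actual value, so the MAPE is about 10.
score = mape([100, 200], [110, 180])
```

Because MAPE divides by the actual values, months with zero demand need special handling before this metric can be applied.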

Saving and Loading Models

The code saves the trained ARIMA model for each drug using the save method from statsmodels. Later, it loads these models to make future predictions.

Final Forecasting

The code uses the saved models to make predictions for the next 12 months for each drug.

Created November 15, 2023

Updated November 15, 2023