MrDoppelganger/UTFSM-INF280-Computational-Statistics

Academic repository for INF-280 Computational Statistics at UTFSM. Features Exploratory Data Analysis (EDA), Statistical Inference, Hypothesis Testing, and Regression Models using Python and Jupyter Notebooks.

Computational Statistics (INF-280)

University: Universidad Técnica Federico Santa María (UTFSM)
Program: Civil Engineering in Informatics
Course: Estadística Computacional (INF-280) [2nd Semester, 2025]


📄 Overview

This repository contains the laboratory assignments and practical coursework developed for the
Computational Statistics (INF-280) course. The primary objective of this coursework was to apply statistical
concepts and probability theory to engineering problems, using computational tools for data analysis
and inference.

The projects focus on the full data analysis lifecycle: from Exploratory Data Analysis (EDA) and visualization
to advanced Statistical Inference and Regression Modeling.

🧪 Laboratory Assignments

The coursework is divided into practical laboratories implemented in Jupyter Notebooks, covering the following
course units:

Lab 1: Exploratory Data Analysis (EDA)

Focuses on Descriptive Statistics to summarize and visualize data characteristics.

  • Key Concepts: Data cleaning, measures of central tendency and dispersion, and distribution visualization
    (histograms, boxplots).
  • Goal: To obtain preliminary insights and detect patterns or anomalies in raw datasets.
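
As a rough illustration of the kind of summary Lab 1 describes, the sketch below computes measures of central tendency and dispersion, then flags anomalies with the 1.5 × IQR rule that boxplots commonly use. It relies only on Python's standard library so it runs anywhere; the notebooks themselves presumably use Pandas and NumPy equivalents. The `values` list is hypothetical and is not data from the course.

```python
import statistics

# Hypothetical sample for illustration only; not taken from the lab datasets.
values = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 19.5, 12.2]

# Central tendency and dispersion
mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)

# Quartiles (default "exclusive" method) and the 1.5 * IQR fences
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are candidate anomalies
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")
print(f"IQR fences=({lower_fence:.3f}, {upper_fence:.3f}) outliers={outliers}")
```

A mean that sits well above the median, as it does here, is itself a hint of right skew or of a high outlier, which is why both measures are usually reported together.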

Lab 2: Probability & Statistical Inference

Centers on probability distributions and the foundations of inferential statistics.

  • Key Concepts: Random variables, probability density functions, sampling distributions, and confidence
    intervals.
  • Goal: To estimate population parameters based on sample data and assess the reliability of these
    estimates.
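
A confidence interval for a population mean can be sketched as follows. This is a minimal example under stated assumptions: it uses the normal approximation from the standard library, which is reasonable for large samples, whereas a Student's t interval (for example via SciPy) would be the usual choice for small ones. The function name `mean_confidence_interval` and the sample values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean, using a normal approximation."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, from the sample standard deviation
    sem = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sem
    return mean - margin, mean + margin


# Hypothetical measurements for illustration only
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Raising `confidence` widens the interval: more certainty about covering the true mean costs precision, which is the trade-off behind "assessing the reliability" of an estimate.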

Lab 3: Hypothesis Testing & Regression Models

Applies advanced inference techniques and predictive modeling.

  • Key Concepts: Null hypothesis significance testing (NHST), p-values, and Linear Regression models
    (Simple and Multiple).
  • Goal: To validate assumptions about data and construct mathematical models to predict outcomes based on
    independent variables.
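
The two ideas in Lab 3 can be sketched in a few lines each. The example below is not the notebooks' code: it shows a two-sided z-test, which assumes the population standard deviation is known, and an ordinary least-squares fit for simple linear regression written out by hand. In practice a t-test from SciPy and an OLS model from Statsmodels would be the more typical tools. The function names and all data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def two_sided_z_test(sample, mu0, sigma):
    """p-value for H0: population mean == mu0, assuming a known sigma."""
    z = (statistics.mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    # Probability of a statistic at least this extreme in either tail
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fit_simple_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data for illustration only
p_value = two_sided_z_test([5.4, 5.6, 5.5, 5.7, 5.3], mu0=5.0, sigma=0.2)
print(f"p-value: {p_value:.2e} (reject H0 at 5%: {p_value < 0.05})")

x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept = fit_simple_regression(x, y)
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

A small p-value indicates only that the observed data would be unlikely if the null hypothesis were true; it does not measure the size or practical importance of the effect.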

🛠 Tech Stack

The statistical analysis was performed using Python and its scientific computing ecosystem:

  • Core Computing: NumPy, Pandas.
  • Visualization: Matplotlib, Seaborn.
  • Statistical Analysis: SciPy, Statsmodels.
  • Environment: Jupyter Notebooks (Anaconda/Miniconda distribution).

📚 Key Competencies & Learning Outcomes

The work in this repository demonstrates the following competencies:

  • Statistical Reasoning: The ability to analyze data and propose solutions based on probabilistic logic rather
    than intuition.
  • Data Modeling: Applying regression techniques to explain relationships between variables and forecast
    future trends.
  • Scientific Computing: Utilizing specialized software to perform complex calculations and generate reproducible
    research.
  • Inference: Drawing conclusions about large populations based on limited sample data with calculated margins
    of error.