QuantumGhost98/football-data-scientist

⚽ Belgian Jupiler Pro League 2021/22 analysis — custom midfielder metrics using 1.1M+ StatsBomb events

⚽ Belgian Jupiler Pro League Analysis 2021/22

A deep-dive analysis of the 2021/22 Belgian Jupiler Pro League season using StatsBomb event-level data. This project explores team and player performance metrics, with a focus on midfielder evaluation and Club Brugge's championship-winning campaign.

📊 Project Overview

This project explores 1.1M+ match events from the Belgian top flight to answer key performance questions and build custom midfielder evaluation metrics.

Key Insights

| Finding | Result |
|---|---|
| League's top shooter | Deniz Undav (146 shots) |
| Highest team xG | Club Brugge (76.91 xG) |
| Players booked (yellow cards) | 379 unique players |
| Club Brugge's creative engine | Noa Lang (55 right-foot shot assists) |
| Club Brugge's pressing leader | Éder Álvarez Balanta (233 counterpressures) |
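Findings such as the top shooter and the highest team xG come down to simple group-by aggregations over the event table. The snippet below is a minimal sketch on a tiny synthetic table, not the notebook's actual code. The column names (`type`, `player`, `team`, `shot_statsbomb_xg`) are assumptions based on common StatsBomb export conventions and may differ from the 161 columns in this dataset.

```python
import pandas as pd

# Tiny synthetic stand-in for the event table.
# Column names are assumptions, not confirmed against the real parquet file.
events = pd.DataFrame({
    "type": ["Shot", "Shot", "Shot", "Pass", "Shot", "Bad Behaviour"],
    "player": ["A", "A", "B", "C", "C", "B"],
    "team": ["X", "X", "Y", "Y", "Y", "Y"],
    "shot_statsbomb_xg": [0.10, 0.25, 0.05, None, 0.40, None],
})


def top_shooters(df, n=10):
    """Count shot events per player and return the n most frequent shooters."""
    shots = df[df["type"] == "Shot"]
    return shots.groupby("player").size().sort_values(ascending=False).head(n)


def team_xg(df):
    """Sum expected goals over all shot events for each team."""
    shots = df[df["type"] == "Shot"]
    return (
        shots.groupby("team")["shot_statsbomb_xg"]
        .sum()
        .sort_values(ascending=False)
    )


print(top_shooters(events))
print(team_xg(events))
```

The same pattern (filter by event type, group, aggregate) would likely cover the card and counterpressure findings as well, with the filter adjusted to whichever columns encode those events.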

Midfielder Metrics Developed

Five custom metrics to evaluate midfielders:

  1. Aerial Wins - Total headers won
  2. Aerial Win % - Efficiency in aerial duels
  3. Long Balls - Completed passes ≥35 yards
  4. Assisted xG - Expected goals from shot-creating passes
  5. Box Positioning - In-box touches during possession
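As an illustration of how some of these metrics could be derived, here is a hedged sketch of Long Balls, Assisted xG, and Aerial Win % on synthetic data. It is not the notebook's implementation. The column names (`id`, `pass_length`, `pass_outcome`, `pass_assisted_shot_id`, `shot_statsbomb_xg`) are assumptions, as is the convention that a missing `pass_outcome` marks a completed pass. The helper `aerial_win_pct` is hypothetical and takes precomputed win and loss counts.

```python
import pandas as pd

# Synthetic events; all column names are assumptions about the export schema.
events = pd.DataFrame({
    "id": ["p1", "p2", "p3", "s1", "s2"],
    "type": ["Pass", "Pass", "Pass", "Shot", "Shot"],
    "player": ["M1", "M1", "M2", "F1", "F2"],
    "pass_length": [40.0, 12.0, 50.0, None, None],  # assumed to be in yards
    "pass_outcome": [None, None, "Incomplete", None, None],  # None = completed (assumed)
    "pass_assisted_shot_id": ["s1", "s2", None, None, None],
    "shot_statsbomb_xg": [None, None, None, 0.30, 0.05],
})


def long_balls(df, min_length=35.0):
    """Count completed passes of at least min_length per player."""
    passes = df[df["type"] == "Pass"]
    completed = passes[
        passes["pass_outcome"].isna() & (passes["pass_length"] >= min_length)
    ]
    return completed.groupby("player").size()


def assisted_xg(df):
    """Sum the xG of every shot that a player's pass directly set up."""
    shots = df.loc[df["type"] == "Shot", ["id", "shot_statsbomb_xg"]]
    key_passes = df.loc[
        df["pass_assisted_shot_id"].notna(), ["player", "pass_assisted_shot_id"]
    ]
    # Link each shot-creating pass to the shot it assisted.
    merged = key_passes.merge(shots, left_on="pass_assisted_shot_id", right_on="id")
    return merged.groupby("player")["shot_statsbomb_xg"].sum()


def aerial_win_pct(wins, losses):
    """Share of aerial duels won, as a percentage (0.0 when there were none)."""
    total = wins + losses
    return 100.0 * wins / total if total else 0.0


print(long_balls(events))
print(assisted_xg(events))
print(aerial_win_pct(3, 1))
```

Joining passes to shots on the assisted shot's id keeps the xG credit tied to the specific pass that created the chance, rather than to every pass in the possession.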

🌟 Top Performers

🏆 Elite Midfielders by Category

Aerial Specialists:

  • Vinicius de Souza Costa (98 wins)
  • Julien De Sart (92 wins)

Long Ball Masters:

  • Josh Cullen (233 completed)
  • Julien De Sart (196 completed)

Complete Playmakers:

  • Hans Vanaken (5.01 xG assisted, 327 box touches)
  • Lior Refaelov (5.71 xG assisted, 223 box touches)

🗂️ Repository Structure

belgian-football-analysis/
│
├── 📓 notebooks/
│   └── jupiler_league_analysis.ipynb   # Main analysis
│
├── 📊 data/
│   ├── matches-events-bel1-2122.parquet
│   └── data-events-matches.zip
│
├── 📁 docs/
│   ├── documentation-events.pdf
│   └── documentation-matches.pdf
│
├── 📈 outputs/
│   ├── top_shooters.png
│   ├── team_xg.png
│   ├── yellow_cards.png
│   ├── cb_shot_assists.png
│   ├── cb_counterpressures.png
│   ├── midfielder_radar.png
│   ├── midfielder_rankings.csv
│   └── results_summary.json
│
├── README.md
├── requirements.txt
└── .gitignore

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • pip

Installation

# Clone the repository
git clone https://github.com/yourusername/belgian-football-analysis.git
cd belgian-football-analysis

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Run the Analysis

jupyter notebook notebooks/jupiler_league_analysis.ipynb

📈 Visualizations

The analysis includes several visualizations:

  • 📊 Shot Leaders - Top 10 shooters across the league
  • 📈 Team xG Rankings - All 18 teams by expected goals
  • 🎯 Midfielder Radar - Multi-dimensional player profiles
  • 🔥 Pressing Intensity - Counterpressure analysis
  • 📍 Attacking Positions - Box involvement metrics

📦 Data Source

| Attribute | Value |
|---|---|
| Provider | StatsBomb |
| Competition | Belgian Jupiler Pro League |
| Season | 2021/2022 |
| Total Events | 1,119,695 |
| Columns | 161 |
| Teams | 18 |
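A quick sanity check after loading the data is to compare its shape against the figures above. The sketch below defines a hypothetical helper, `summarize_events`, and runs it on a small synthetic frame. The `team` column name is an assumption about the dataset's schema.

```python
import pandas as pd


def summarize_events(df, team_col="team"):
    """Return event count, column count, and number of distinct teams."""
    return {
        "events": len(df),
        "columns": df.shape[1],
        "teams": df[team_col].nunique(),
    }


# Synthetic stand-in; the real table would come from the parquet file.
sample = pd.DataFrame({
    "team": ["X", "Y", "X"],
    "type": ["Pass", "Shot", "Pass"],
})
summary = summarize_events(sample)
print(summary)
```

On the real data, the frame would presumably be loaded with `pd.read_parquet("data/matches-events-bel1-2122.parquet")` from the repository root (pyarrow is listed in the tech stack), and the summary should then match the event, column, and team counts given above.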

🛠️ Tech Stack

  • pandas - Data manipulation
  • NumPy - Numerical operations
  • matplotlib - Visualizations
  • seaborn - Statistical graphics
  • pyarrow - Parquet handling

📝 License

MIT License - see LICENSE for details.


🙏 Acknowledgments

  • StatsBomb for the open event data
  • Belgian Jupiler Pro League for a great 2021/22 season
  • Club Brugge - Champions! 🏆

Built with ❤️ for football analytics