QuantumGhost98/football-data-scientist
⚽ Belgian Jupiler Pro League 2021/22 analysis — custom midfielder metrics using 1.1M+ StatsBomb events
⚽ Belgian Jupiler Pro League Analysis 2021/22
A deep-dive analysis of the 2021/22 Belgian Jupiler Pro League season using StatsBomb event-level data. This project explores team and player performance metrics, with a focus on midfielder evaluation and Club Brugge's championship-winning campaign.
📊 Project Overview
This project explores 1.1M+ match events from the Belgian top flight to answer key performance questions and build custom midfielder evaluation metrics.
Key Insights
| Finding | Result |
|---|---|
| League's top shooter | Deniz Undav (146 shots) |
| Highest team xG | Club Brugge (76.91 xG) |
| Players booked (yellow cards) | 379 unique players |
| Club Brugge's creative engine | Noa Lang (55 right-foot shot assists) |
| Club Brugge's pressing leader | Éder Álvarez Balanta (233 counterpressures) |
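Findings like these come down to filtering the event table and aggregating per player or team. The following is a minimal sketch of that pattern in pandas, not the notebook's actual code. The column names `type`, `player`, `team`, `shot_xg`, and `counterpress` are assumptions for illustration and may not match the 161 columns in the real dataset, and the rows are toy data rather than real league figures.

```python
import pandas as pd

# Toy event rows (hypothetical column names and values).
events = pd.DataFrame(
    {
        "type": ["Shot", "Shot", "Shot", "Pressure", "Pressure", "Pass"],
        "player": ["A", "A", "B", "C", "C", "B"],
        "team": ["X", "X", "Y", "X", "X", "Y"],
        "shot_xg": [0.10, 0.25, 0.40, None, None, None],
        "counterpress": [None, None, None, True, True, None],
    }
)

shots = events[events["type"] == "Shot"]

# Shots per player, most frequent shooter first.
shots_per_player = shots["player"].value_counts()

# Total xG per team, highest first.
team_xg = shots.groupby("team")["shot_xg"].sum().sort_values(ascending=False)

# Counterpressures per player for a single team ("X" in this toy data).
is_counterpress = events["counterpress"].eq(True)
team_x_counterpress = (
    events[is_counterpress & (events["team"] == "X")]["player"].value_counts()
)

print(shots_per_player.head(1))
print(team_xg.head(1))
print(team_x_counterpress.head(1))
```

The same filter-then-aggregate shape would also cover the yellow-card count (a filter on card events followed by `nunique()` on the player column), assuming the dataset exposes a card column.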
Midfielder Metrics Developed
Five custom metrics to evaluate midfielders:
- Aerial Wins - Total aerial duels won
- Aerial Win % - Share of contested aerial duels won
- Long Balls - Completed passes ≥35 yards
- Assisted xG - Expected goals from shot-creating passes
- Box Positioning - In-box touches during possession
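To make the definitions above concrete, here is a hedged sketch of how three of the metrics (Long Balls, Assisted xG, and the two aerial metrics) could be computed with pandas. Every column name (`pass_length`, `pass_completed`, `assisted_shot_id`, `shot_xg`, `aerial_outcome`) is hypothetical, the rows are toy data, and the source data may encode pass completion and aerial outcomes differently. Box Positioning is left out because it depends on pitch coordinates whose layout is not described here.

```python
import pandas as pd

LONG_BALL_MIN_YARDS = 35  # threshold taken from the Long Balls definition

# Toy rows only; all column names are hypothetical.
events = pd.DataFrame(
    {
        "id": ["p1", "p2", "p3", "p4", "s1", "s2", "d1", "d2", "d3"],
        "type": ["Pass"] * 4 + ["Shot"] * 2 + ["Duel"] * 3,
        "player": ["M1", "M1", "M2", "M2", "F1", "F2", "M1", "M1", "M2"],
        "pass_length": [40.0, 12.0, 50.0, 36.0, None, None, None, None, None],
        "pass_completed": [True, True, False, True, None, None, None, None, None],
        "assisted_shot_id": ["s1", None, None, "s2", None, None, None, None, None],
        "shot_xg": [None, None, None, None, 0.30, 0.05, None, None, None],
        "aerial_outcome": [None] * 6 + ["won", "lost", "won"],
    }
)

passes = events[events["type"] == "Pass"]
shots = events[events["type"] == "Shot"]
aerials = events[events["aerial_outcome"].notna()]

# Long Balls: completed passes of at least 35 yards.
long_mask = passes["pass_completed"].eq(True) & (
    passes["pass_length"] >= LONG_BALL_MIN_YARDS
)
long_balls = passes[long_mask].groupby("player").size().rename("long_balls")

# Assisted xG: join each assisting pass to the shot it created, then sum xG.
assisted = (
    passes[["player", "assisted_shot_id"]]
    .dropna()
    .merge(shots[["id", "shot_xg"]], left_on="assisted_shot_id", right_on="id")
)
assisted_xg = assisted.groupby("player")["shot_xg"].sum().rename("assisted_xg")

# Aerial Wins and Aerial Win %: wins divided by all contested aerial duels.
won = aerials["aerial_outcome"].eq("won").astype(int)
aerial = won.groupby(aerials["player"]).agg(aerial_wins="sum", aerial_duels="count")
aerial["aerial_win_pct"] = 100 * aerial["aerial_wins"] / aerial["aerial_duels"]

# One row per player, one column per metric.
metrics = pd.concat([aerial, long_balls, assisted_xg], axis=1).fillna(0)
print(metrics)
```

Keeping each metric as its own named Series and concatenating at the end makes it easy to add or drop a metric without touching the others.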
🌟 Top Performers
🏆 Elite Midfielders by Category
Aerial Specialists:
- Vinicius de Souza Costa (98 wins)
- Julien De Sart (92 wins)
Long Ball Masters:
- Josh Cullen (233 completed)
- Julien De Sart (196 completed)
Complete Playmakers:
- Hans Vanaken (5.01 xG assisted, 327 box touches)
- Lior Refaelov (5.71 xG assisted, 223 box touches)
🗂️ Repository Structure
belgian-football-analysis/
│
├── 📓 notebooks/
│ └── jupiler_league_analysis.ipynb # Main analysis
│
├── 📊 data/
│ ├── matches-events-bel1-2122.parquet
│ └── data-events-matches.zip
│
├── 📁 docs/
│ ├── documentation-events.pdf
│ └── documentation-matches.pdf
│
├── 📈 outputs/
│ ├── top_shooters.png
│ ├── team_xg.png
│ ├── yellow_cards.png
│ ├── cb_shot_assists.png
│ ├── cb_counterpressures.png
│ ├── midfielder_radar.png
│ ├── midfielder_rankings.csv
│ └── results_summary.json
│
├── README.md
├── requirements.txt
└── .gitignore
🚀 Quick Start
Prerequisites
- Python 3.9+
- pip
Installation
# Clone the repository
git clone https://github.com/yourusername/belgian-football-analysis.git
cd belgian-football-analysis
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Run the Analysis
jupyter notebook notebooks/jupiler_league_analysis.ipynb
📈 Visualizations
The analysis includes several visualizations:
- 📊 Shot Leaders - Top 10 shooters across the league
- 📈 Team xG Rankings - All 18 teams by expected goals
- 🎯 Midfielder Radar - Multi-dimensional player profiles
- 🔥 Pressing Intensity - Counterpressure analysis
- 📍 Attacking Positions - Box involvement metrics
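As an illustration of the first chart in this list, the sketch below draws a horizontal bar chart of top shooters with matplotlib. It is an assumed approach, not the notebook's plotting code: the helper `plot_top_shooters` is hypothetical, and the player names and counts are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_top_shooters(shot_counts: pd.Series, n: int = 10):
    """Draw the n highest shot counts as a horizontal bar chart."""
    top = shot_counts.sort_values(ascending=False).head(n)
    fig, ax = plt.subplots(figsize=(8, 5))
    # Reverse the order so the leader is drawn at the top of the chart.
    ax.barh(list(top.index[::-1]), list(top.values[::-1]))
    ax.set_xlabel("Shots")
    ax.set_title(f"Top {len(top)} shooters")
    fig.tight_layout()
    return fig, ax


# Placeholder counts, not real league data.
counts = pd.Series({"Player A": 12, "Player B": 30, "Player C": 7})
fig, ax = plot_top_shooters(counts, n=2)
# fig.savefig("outputs/top_shooters.png", dpi=150)
```

The commented `savefig` call points at the `outputs/top_shooters.png` path from the repository structure. Uncomment it once the `outputs/` directory exists.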
📦 Data Source
| Attribute | Value |
|---|---|
| Provider | StatsBomb |
| Competition | Belgian Jupiler Pro League |
| Season | 2021/2022 |
| Total Events | 1,119,695 |
| Columns | 161 |
| Teams | 18 |
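A quick sanity check after loading the data is to compare its shape against the table above. The sketch below uses a hypothetical helper, `summarize_events`, and assumes the team name is stored in a column called `team`. It runs on a toy frame, with the real load shown as a comment.

```python
import pandas as pd


def summarize_events(events: pd.DataFrame, team_col: str = "team") -> dict:
    """Return the row count, column count, and number of distinct teams."""
    return {
        "total_events": len(events),
        "columns": events.shape[1],
        "teams": events[team_col].nunique(),
    }


# Toy frame standing in for the real event data.
toy = pd.DataFrame({"team": ["X", "Y", "X"], "type": ["Pass", "Shot", "Pass"]})
summary = summarize_events(toy)
print(summary)

# With the real file (pandas needs pyarrow to read Parquet):
# events = pd.read_parquet("data/matches-events-bel1-2122.parquet")
# summarize_events(events)
```

On the real file, the result would be expected to match the Total Events, Columns, and Teams rows of the table, provided the team column name is adjusted to whatever the dataset actually uses.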
🛠️ Tech Stack
- pandas - Data manipulation
- NumPy - Numerical operations
- matplotlib - Visualizations
- seaborn - Statistical graphics
- pyarrow - Parquet handling
📝 License
MIT License - see LICENSE for details.
🙏 Acknowledgments
- StatsBomb for the open event data
- Belgian Jupiler Pro League for a great 2021/22 season
- Club Brugge - Champions! 🏆
Built with ❤️ for football analytics