Abdullah321Umar/CodeAlpha_Web-Scraping-Project1
FIFA Web Scraping: Automated Python script to scrape FIFA World Rankings from the official FIFA website. Uses Selenium to handle dynamic pages and BeautifulSoup for parsing HTML content. Collects team names, ranks, points, and other stats automatically. Stores the data in a structured CSV file for analysis or visualization.
FIFA Data Web Scraping Project 1
Project Overview
This project is an automated web scraping solution to extract FIFA World Rankings and team statistics from the official FIFA website.
The goal is to collect, clean, and store football data for analysis, visualization, or research purposes.
Key Highlights:
- Automated extraction of FIFA rankings and team information
- Data cleaning and structured storage using Pandas
- Integration of Selenium and BeautifulSoup for dynamic content
- Exporting data to CSV for easy analysis
Tools & Technologies
| Technology | Purpose |
|---|---|
| Python | Scripting and data handling |
| Selenium | Browser automation for dynamic content |
| BeautifulSoup | HTML parsing and data extraction |
| Pandas | Data structuring and CSV export |
| ChromeDriver | Browser control for Selenium |
| Jupyter Notebook | Development and testing environment |
Project Workflow
1. Problem Identification
Manual extraction of FIFA rankings is time-consuming and prone to errors.
This project automates the process to:
- Collect FIFA World Rankings
- Capture team names, ranks, points, and country codes
- Export the data in a structured format for analysis
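The fields listed above could be modelled as one record per team before export. The sketch below uses only the standard library; the class name `RankingEntry`, its field names, and the helper `to_csv_text` are hypothetical, and the single sample row is illustrative.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class RankingEntry:
    """One row of the rankings table (hypothetical field names)."""
    rank: int
    team: str
    country_code: str
    points: float


def to_csv_text(entries):
    """Serialise entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header comes from the dataclass itself, so columns cannot drift.
    writer.writerow([f.name for f in fields(RankingEntry)])
    for entry in entries:
        writer.writerow(astuple(entry))
    return buffer.getvalue()


sample = [RankingEntry(1, "Argentina", "ARG", 1841.0)]
print(to_csv_text(sample))
```

Typing the rank and points as numbers at this stage means later sorting and comparison do not depend on string ordering.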
2. Web Scraping Strategy
Dynamic Content Handling
- Use Selenium to open and interact with FIFA's dynamic pages
- Wait for tables to fully load before parsing
HTML Parsing
- Use BeautifulSoup to extract:
  - Team names
  - Ranking positions
  - Points and statistics
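The extraction step can be tried without a browser by running BeautifulSoup on a static snippet. In this sketch the HTML is hypothetical: its class names follow the ones used in this project's parsing code and are not a verified copy of the live page. The helper name `parse_rows` is also an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<table>
  <tr class="ranking-row">
    <td class="rank">1</td>
    <td class="team-name"> Argentina </td>
    <td class="points">1841</td>
  </tr>
  <tr class="ranking-row">
    <td class="rank">2</td>
    <td class="team-name">France</td>
  </tr>
</table>
"""


def parse_rows(html):
    """Return [rank, team, points] lists, skipping incomplete rows."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("tr.ranking-row"):
        cells = [tr.select_one(f"td.{name}") for name in ("rank", "team-name", "points")]
        if any(cell is None for cell in cells):
            continue  # a row without all three cells is ignored
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


print(parse_rows(SAMPLE_HTML))
```

Keeping the parser in a function that takes an HTML string makes it possible to test it against saved pages when the site layout changes.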
Code Structure
a) Importing Libraries
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
b) Setting up Selenium WebDriver
service = Service("path_to_chromedriver.exe")  # placeholder: path to the local ChromeDriver binary
options = webdriver.ChromeOptions()
options.add_argument("--start-maximized")
driver = webdriver.Chrome(service=service, options=options)
c) Navigating FIFA Rankings Page
url = "https://www.fifa.com/fifa-world-ranking/"
driver.get(url)
time.sleep(5)  # Wait for dynamic content to load
d) Parsing Rankings Table
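soup = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()  # the page source is captured, so the browser can be closed
teams = []
for row in soup.find_all("tr", class_="ranking-row"):
    rank = row.find("td", class_="rank")
    team = row.find("td", class_="team-name")
    points = row.find("td", class_="points")
    # Skip rows that lack any of the expected cells
    if rank is None or team is None or points is None:
        continue
    teams.append([rank.text.strip(), team.text.strip(), points.text.strip()])
e) Saving Data to CSV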
df = pd.DataFrame(teams, columns=["Rank", "Team", "Points"])
df.to_csv("FIFA_Rankings.csv", index=False)
Challenges & Solutions
| Challenge | Solution |
|---|---|
| Dynamic content loading | Added time.sleep() and Selenium waits |
| Complex HTML structure | Used browser inspect tools to locate elements |
| Missing data | Added checks to skip empty rows |
| Large dataset | Stored results in CSV for structured analysis |
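The missing-data handling in the table can also be applied after scraping, on the DataFrame itself. This is a minimal sketch, assuming the `Rank`, `Team`, and `Points` columns used in the CSV export; the function name `clean_rankings` and the sample rows are hypothetical.

```python
import pandas as pd


def clean_rankings(df):
    """Return a cleaned copy: numeric columns, no empty or duplicate rows."""
    out = df.copy()
    out["Team"] = out["Team"].str.strip()
    # Non-numeric text becomes NaN instead of raising an error.
    out["Rank"] = pd.to_numeric(out["Rank"], errors="coerce")
    out["Points"] = pd.to_numeric(out["Points"], errors="coerce")
    return (
        out.dropna(subset=["Rank", "Team", "Points"])
        .loc[lambda d: d["Team"] != ""]
        .drop_duplicates(subset="Team")
        .assign(Rank=lambda d: d["Rank"].astype(int))
        .sort_values("Rank")
        .reset_index(drop=True)
    )


raw = pd.DataFrame(
    [["2", "France", "1827"], ["1", " Argentina ", "1841"],
     ["", "", ""], ["2", "France", "1827"]],
    columns=["Rank", "Team", "Points"],
)
print(clean_rankings(raw))
```

Converting the scraped strings to numbers before export lets later analysis sort by rank and compare points numerically.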
Output & Results
- Successfully scraped all FIFA-ranked teams
- Data exported to FIFA_Rankings.csv
Top 5 Teams
| Rank | Team | Points |
|---|---|---|
| 1 | Argentina | 1841 |
| 2 | France | 1827 |
| 3 | Brazil | 1818 |
| 4 | Belgium | 1778 |
| 5 | England | 1769 |
Future Improvements
- Schedule automatic scraping for real-time updates
- Visualize rankings using matplotlib or seaborn
- Store historical data in a database for trend analysis
- Extend scraping to include team stats, goals, and player rankings
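As one possible shape for the historical-data idea, the sketch below stores each scrape as dated rows in SQLite using only the standard library. The table layout, the function names `save_snapshot` and `team_history`, and the sample date are all assumptions for illustration.

```python
import sqlite3
from datetime import date


def save_snapshot(conn, rows, scraped_on):
    """Store one scrape as dated rows; re-running the same date overwrites."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS rankings (
               scraped_on TEXT NOT NULL,
               rank INTEGER NOT NULL,
               team TEXT NOT NULL,
               points REAL NOT NULL,
               PRIMARY KEY (scraped_on, team)
           )"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO rankings VALUES (?, ?, ?, ?)",
        [(scraped_on.isoformat(), rank, team, points) for rank, team, points in rows],
    )
    conn.commit()


def team_history(conn, team):
    """Return (date, rank, points) tuples for one team, oldest first."""
    cur = conn.execute(
        "SELECT scraped_on, rank, points FROM rankings "
        "WHERE team = ? ORDER BY scraped_on",
        (team,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # use a file path to keep data between runs
# The date below is an arbitrary placeholder, not a real ranking release date.
save_snapshot(conn, [(1, "Argentina", 1841.0), (2, "France", 1827.0)], date(2024, 1, 1))
print(team_history(conn, "Argentina"))
```

Keying rows on the scrape date and team means a scheduled job can be re-run safely, and trend queries reduce to a filter on `team`.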
Learning Outcomes
- Hands-on experience with Selenium and BeautifulSoup integration
- Understanding dynamic web content and HTML parsing
- Improved Python, automation, and data handling skills
- Learned to handle real-world web scraping challenges
Conclusion
This project demonstrates the ability to automate FIFA ranking extraction, producing structured datasets for analysis or reporting.
It showcases skills in Python programming, web scraping, and data management, useful for sports analytics, data science, and research projects.
