mhmdkardosha/CAT-Reloaded-2025-Data-Science-Roadmap
Roadmap for Data Science circle associated with CAT Reloaded.
Data Science Roadmap 2025
This roadmap is maintained by Mohamed Kardosha.
1. Introduction
Data science is a multidisciplinary field that examines raw and structured data sets to provide potentially actionable insights. The field focuses on asking the right questions rather than on finding exact answers. Data scientists need skill sets centered on computer science, mathematics, and statistics. They analyze data with techniques such as machine learning, trend analysis, linear regression, and predictive modeling, and the tools they use to apply these techniques include Python and R.
- These are small differences between each job title:
- For more details about each job title, you can see this Arabic video or this other video.
2. Levels
The roadmap is divided into four main levels, with the Advanced level split into two parts (A and B). Each level is divided into weeks, and each week has a set of tasks to complete. We will add the task links one by one as each task is finished. Each level is designed to take 1-3 months on average, but the time needed to complete the roadmap varies from person to person.
- Entry: Good introduction to the field.
- Beginner: Data scientist toolkit and foundations.
- Intermediate: Dive deeper and build a solid understanding of working with data.
- Advanced A: Mathematics and Machine Learning.
- Advanced B: Deep Learning and specializing in a specific field.
2.1. Entry Level
It includes the following topics:
- Data Literacy
- Understanding Data Science
- Introduction to Statistics
- Python Basics
- OOP in Python
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Week 1 | | Base resources, Alternative resources | |
| Week 2 | | Base resources, Alternative resources | |
| Week 3 | Python Basics | Base resources, Alternative resources | |
| Week 4 | OOP in Python | Base resources, Alternative resources | |
2.2. Beginner Level
It includes the following topics:
- NumPy
- Pandas
- Matplotlib
- Seaborn
- Power BI
- Git & GitHub
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Week 1 | NumPy | Base resources, Alternative resources | |
| Week 2 | Pandas | Base resources, Alternative resources | |
| Week 3 | Matplotlib | Base resources, Alternative resources | |
| Week 4 | Seaborn | Base resources, Alternative resources | |
| Week 5 | Power BI | Base resources, Alternative resources | |
| Week 6 | Git & GitHub | Resources | |
2.3. Intermediate Level
It includes the following topics:
- Regular Expressions (RegEx)
- Data Cleaning
- Feature Engineering
- Exploratory Data Analysis
- Web Scraping
- Structured Query Language (SQL)
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Week 1 | Regular Expressions (RegEx) | Base resources, Alternative resources | |
| Week 2 | Data Cleaning | Base resources | |
| Week 3 | Feature Engineering | Base resources | |
| Week 4 | Exploratory Data Analysis (EDA) | Base resources, Alternative resources | |
| Week 5 | Web Scraping | Base resources | |
| Week 6 | Structured Query Language (SQL) | Base resources, Alternative resources | |
2.4. Advanced A Level
It includes the following topics:
- Math required for Machine Learning:
  - Linear Algebra
  - Multivariate Calculus
- Machine Learning Algorithms:
  - Supervised Learning
  - Unsupervised Learning
  - Ensemble Learning
- Model Evaluation and Selection
- APIs
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Week 1 | Linear Algebra | Base resources, Alternative resources | |
| Week 2 | Multivariate Calculus | Base resources, Alternative resources | |
At this stage, you are ready to dive deep into the world of Machine Learning. The following resources are general and are not divided into categories or weeks. If you want, you can follow them in parallel with the weekly base resources as supplementary material.
- StatQuest | Machine Learning
- Data School | Machine Learning
- Machine Learning from Scratch | YouTube playlist
- Machine Learning Mastery
- Udacity | Intro to Machine Learning
- Sentdex | Machine Learning with Python
- Hesham Asem | YouTube Arabic playlists
- Machine Learning in Arabic
Now let's continue the roadmap week by week.
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Weeks 3 - 4 | Supervised Learning | Resources | |
| Weeks 5 - 14 | Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition | | |
| Weeks 15 - 16 | | Resources | |
| Weeks 17 - 18 | | Resources | |
| Week 20 | APIs | Resources | |
At this stage, you have a strong grounding in machine learning algorithms and how they work. You have also learned about APIs and how to use them. You are now ready to train models, practice on datasets, and build projects that use the algorithms you learned. You may also implement a machine learning algorithm from scratch, which is great practice for understanding the algorithms more deeply.
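As one possible starting point for the "from scratch" practice mentioned above, here is a minimal sketch of simple linear regression trained with batch gradient descent in plain Python. The function names, the toy data, and the hyperparameters (`learning_rate`, `epochs`) are illustrative assumptions and are not part of the roadmap's tasks.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared difference between predictions and targets."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_regression(xs, ys)
mse = mean_squared_error(xs, ys, w, b)
print(f"w={w:.4f}, b={b:.4f}, mse={mse:.2e}")
```

On this noise-free data the fitted slope and intercept should approach 2 and 1. Comparing the result with a library implementation such as scikit-learn's `LinearRegression` is a useful follow-up exercise.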
2.5. Advanced B Level
In this stage, you will enter the world of Deep Learning and NLP. It is divided into three phases:
- Phase 1: Basic concepts of Deep Learning (NN, CNN, RNN, Backpropagation, Optimizers, etc.)
- Phase 2: Transformers and LLMs.
- Phase 3: NLP fields.
2.5.1. Phase 1: Basic Concepts of Deep Learning
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Weeks 1 - 3 | Basic concepts of Deep Learning | Resources, Alternative resources | |
| Weeks 3 - 4 | | Resources, Alternative resources | |
| Week 5 | | Resources | |
| Weeks 6 - 8 | Convolutional Neural Networks (CNN) | Resources, Alternative resources | |
| Weeks 9 - 11 | Recurrent Neural Networks (RNN) | Resources, Alternative resources | |
2.5.2. Phase 2: Transformers and LLMs
| Phase | Topics | Resources | Tasks |
|---|---|---|---|
| Weeks 12 - 13 | Transformers | Resources, Required Projects | |
| Weeks 14 - 15 | Large Language Models (LLMs) | Resources, Alternative Resources | |
2.5.3. Phase 3: NLP fields
- NLP has many sub-fields; one of them is retrieval-augmented generation (RAG).
- First, you need to learn LangChain and LangGraph.
- We also recommend Abu Bakr Soliman's course. In it, you will learn many of the concepts and tools needed to build a solid project, such as FastAPI, Docker, MongoDB, and the MVC design pattern.
More will be added, and we will try to keep this roadmap updated with the latest resources.
If you have questions or need clarification on any topic, don't hesitate to ask us:
- Mohamed Kardosha: Data Science circle Head.
- Doaa Helal: Data Science circle Vice-Head.
A special thanks goes out to all the contributors who supported the creation of this roadmap and the preparation of its tasks. Deep appreciation as well to the dedicated supervisors for their commitment and hard work throughout the year:
