amrkh97/Lipify-LipReading
Our project's source code and documentation, submitted as part of the requirements for Graduation Project-2 (CCEN481) in the Computer Engineering Program at Cairo University, Faculty of Engineering.
Lipify - A Lip Reading Application
Project Dependencies:
- Python>=3.7.1
- tensorflow>=2.1.0
- opencv-python>=4.2.0
- dlib
- moviepy>=1.0.1
- numpy>=1.18.1
- Pillow
- matplotlib
- tqdm
- pyDot
- seaborn
- scikit-learn
- imutils>=0.5.3
Note: All dependencies are listed in 'setup.py'.
Project's Dataset Structure:
- GP DataSet/
    | --> align/
    | --> video/
- Videos-After-Extraction/
    | --> S1/
    | --> ....
    | --> S20/
- New-DataSet-Videos/
    | --> S1/
    | --> ....
    | --> S20/
- S1/
    | --> Adverb/
    | --> Alphabet/
    | --> Colors/
    | --> Commands/
    | --> Numbers/
    | --> Prepositions/
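As a quick sanity check on the layout above, the following minimal sketch reports which category folders are missing from a segmented speaker directory. It assumes each speaker folder (for example S1/) holds the six category subfolders listed above. The function name `missing_categories` is hypothetical and is not part of the project's code.

```python
from pathlib import Path

# Category folder names as they appear in the dataset structure above.
CATEGORIES = ["Adverb", "Alphabet", "Colors", "Commands", "Numbers", "Prepositions"]


def missing_categories(speaker_dir):
    """Return the category folders that are absent from speaker_dir.

    Hypothetical helper for illustration only; not part of the project.
    """
    root = Path(speaker_dir)
    return [name for name in CATEGORIES if not (root / name).is_dir()]
```

Calling `missing_categories("S1")` returns an empty list when all six folders exist, which makes it easy to spot an incomplete segmentation run before training.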
Dataset Info:
We use the GRID Corpus dataset which is publicly available at this link
You can download the dataset using our script: GridCorpus-Downloader.sh
which was adapted from the code provided here
To download, run the following command in your terminal:
bash GridCorpus-Downloader.sh FirstSpeaker SecondSpeaker
where FirstSpeaker and SecondSpeaker are integers for the number of speakers to download
- NOTE: Speaker 21 is missing from the GRID Corpus dataset due to technical issues.
Dataset Segmentation Steps:
- Run DatasetSegmentation.py
- Run Pre-Processing/frameManipulator.py
* After running the above files, all resulting videos will be 1 second long at 30 FPS.
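A 1-second clip at 30 FPS contains exactly 30 frames, so every segmented word ends up with the same number of frames regardless of its original duration. The sketch below shows one simple way to map a clip of arbitrary length onto 30 frame slots by picking evenly spaced source indices. It is an illustration under that assumption, not a description of what frameManipulator.py actually does, and `resample_indices` is a hypothetical name.

```python
def resample_indices(n_frames, target=30):
    """Pick `target` evenly spaced frame indices from a clip of n_frames frames.

    Hypothetical helper for illustration; the project's own script may
    normalize clip length differently.
    """
    if n_frames <= 0:
        raise ValueError("clip has no frames")
    # Each output slot i maps to a source frame. Clips shorter than
    # `target` repeat frames; longer clips skip frames.
    return [(i * n_frames) // target for i in range(target)]
```

For a 15-frame clip each source frame is used twice, and for a 60-frame clip every second frame is kept, so the output always has 30 entries.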
CNN Models Training Steps:
- Model code can be found in the directory "NN-Models".
- First, you will need to change the common path value to the directory of your training and test data.
- Run each network to start training.
- Early stopping was used to help stop the training of the model at its optimum validation accuracy.
- Resultant accuracies after training on the data can be found in: Project Accuracies or in the following illustration:

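The early stopping mentioned in the training steps can be illustrated with a small, framework-free sketch: training halts once validation accuracy has not improved for a set number of epochs, and the best epoch so far is remembered. The `patience` value and the function name `early_stopping_epoch` are assumptions for illustration. In Keras the same idea is available as the `tf.keras.callbacks.EarlyStopping` callback (with parameters such as `monitor`, `patience`, and `restore_best_weights`), though the exact settings used in this project are not stated here.

```python
def early_stopping_epoch(val_accuracies, patience=3):
    """Return (stop_epoch, best_epoch) for per-epoch validation accuracies.

    Training stops at the first epoch where `patience` epochs have
    passed without a new best accuracy. The patience value is an assumption.
    """
    best_acc = float("-inf")
    best_epoch = -1
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            # New best validation accuracy: remember where it happened.
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return epoch, best_epoch
    # Ran through every epoch without triggering the stop condition.
    return len(val_accuracies) - 1, best_epoch
```

With a history such as `[0.50, 0.62, 0.70, 0.69, 0.68, 0.67, 0.71]` and `patience=3`, training would stop at epoch 5 and keep the weights from epoch 2, never reaching the later value of 0.71. That is the usual trade-off when choosing a small patience.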
CNN Architecture:
All of our networks share the same architecture; the only
difference is the output layer, as shown in:
TODOs:
- Dataset preprocessing module
- Initial Convolutional Neural networks' architecture
- Facial detection algorithm
- Optimization of the networks' architectures
- Unit testing of project files
- Proper documentation for the whole project
