
Akhand-Pratap-Tiwari/Automatic-Extractive-Text-Summarization-using-TF-IDF

Text Summarization using TF-IDF technique in Python.

Automatic extractive text summarization is the process of algorithmically creating a summary of a text document by selecting sentences from the document itself. TF-IDF is one of the simplest and most widely used scoring schemes for this task.

TF-IDF (term frequency-inverse document frequency) is a statistical measure of how important a word is to a document. A word's score rises with how often it appears in the document and falls with how many other documents also contain it, so words that appear everywhere carry little weight.
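As a minimal sketch of that definition, the snippet below computes TF-IDF by hand for a toy corpus. The helper names (`tf`, `idf`, `tf_idf`), the example sentences, and the choice of an unsmoothed natural-log IDF are assumptions made for illustration; they are not taken from the repository's code, and libraries often use smoothed variants.

```python
import math
from collections import Counter

def tf(term, doc_tokens):
    """Term frequency: share of the document's tokens that equal `term`."""
    return Counter(doc_tokens)[term] / len(doc_tokens)

def idf(term, corpus):
    """Inverse document frequency: log(N / number of documents containing `term`)."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / containing) if containing else 0.0

def tf_idf(term, doc_tokens, corpus):
    """Product of the two factors above (one common, unsmoothed formulation)."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Hypothetical toy corpus: each document is a list of lowercase tokens.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

score_the = tf_idf("the", corpus[0], corpus)  # appears in every document
score_cat = tf_idf("cat", corpus[0], corpus)  # appears in two of three documents
score_mat = tf_idf("mat", corpus[0], corpus)  # appears only in the first document
```

With this formulation, a word found in every document gets an IDF of zero, so "the" contributes nothing, while the rarer "mat" outscores "cat" even though both occur once in the first document.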

TF-IDF represents a document as a vector with one element per unique word, so the length of the vector is the number of unique words in the vocabulary being scored. The value of each element is the TF-IDF score of the corresponding word.
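The vector representation can be sketched as follows. This is an illustrative example, not the repository's implementation: the function name `tfidf_vectors`, the sorted vocabulary order, and the unsmoothed log IDF are assumptions, and the sketch assumes every document contains at least one token.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized, non-empty documents."""
    # One vector position per unique word across all documents.
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([
            (counts[word] / len(doc)) * math.log(n_docs / df[word])
            for word in vocab
        ])
    return vocab, vectors

# Hypothetical two-document corpus.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
]
vocab, vectors = tfidf_vectors(docs)
```

Every vector has the same length as `vocab`, and a word that is absent from a document, or present in all documents, scores zero at its position.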

To summarize a text document, each sentence is scored by the TF-IDF values of the words it contains. The highest-scoring sentences, those containing the most important words, are selected and concatenated to form the summary.
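The sentence-selection step might look like the sketch below. It rests on several assumptions that the description above does not specify: each sentence is treated as its own "document" when computing IDF, sentences are split with a naive regular expression, a sentence's score is the sum of its words' TF-IDF values, and selected sentences are emitted in their original order. The function name `summarize` and its `n_sentences` parameter are hypothetical.

```python
import math
import re
from collections import Counter

def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, joined in their original order."""
    # Naive split: break after '.', '!' or '?' followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    n = len(sentences)
    # Sentence-level document frequency: each sentence counts as one document.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    def score(toks):
        if not toks:
            return 0.0
        counts = Counter(toks)
        # Sum of TF * IDF over the distinct words in the sentence.
        return sum((c / len(toks)) * math.log(n / df[w]) for w, c in counts.items())

    # Rank sentence indices by score, keep the top n, then restore document order.
    ranked = sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)

# Hypothetical input text.
text = (
    "The cat sat on the mat. "
    "The cat sat on the rug. "
    "Quantum computers factor integers."
)
top_one = summarize(text, n_sentences=1)
top_two = summarize(text, n_sentences=2)
```

Because term frequency is normalized by sentence length, this scoring favors sentences whose words are rare across the text rather than sentences that are merely long. Stop-word removal and stemming are common additions that this sketch leaves out.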

There is only a single Python file because the technique is that simple to implement.

Languages

Jupyter Notebook 80.6%, Python 19.4%

Created November 17, 2022
Updated October 12, 2024