47 results for “topic:natural-language-processing-nlp”
DLO8012: Natural Language Processing [NLP] & CSL804: Computational Lab - II | BE Semester VIII | Computer Engineering
This project analyzes IMDb movie reviews to classify sentiments as positive or negative. It includes text preprocessing, feature extraction using TF-IDF and CountVectorizer, training Logistic Regression and Naive Bayes classifiers, and visualizing frequent words with WordClouds.
The results are drawn from experiments on the classification of legal documents using LLMs in a real-world institutional setting
An image captioning model using a DETR-inspired architecture.
News Category Classification using AG News dataset. Implements text preprocessing, TF-IDF vectorization, and trains Logistic Regression and a Neural Network to classify news into World, Sports, Business, and Sci/Tech categories. Includes data visualization and model evaluation.
Brainlink is an interactive web application designed to foster knowledge sharing and collaborative learning.
💬 It uses NLP techniques to classify reviews as positive, neutral or negative, providing valuable insights into customer feedback.
PropInsight is an AI-powered property inspection report generator that uses LLMs to analyze property types and observed issues, generating comprehensive, data-driven reports for smarter decision-making.
Fake News Detection system using NLP techniques and TF-IDF vectorization to classify news articles as real or fake. Trained with Logistic Regression and SVM on the Fake and Real News Dataset, with preprocessing, evaluation metrics, and word cloud visualizations.
Named Entity Recognition (NER) on news articles using rule-based and spaCy models. Includes entity extraction, visualizations with displaCy, and comparison of small vs large spaCy models for analysis and insights.
NLP system that analyzes legal documents, identifies compliance risks, and generates contract summaries using transformer-based legal AI.
A PDF reader application powered by AI, allowing users to upload PDF documents and extract meaningful information using advanced NLP models. Built with Streamlit, Transformers, and LangChain, this app provides a seamless interface for interacting with and analyzing PDF content.
A deep learning-based AI tool for correcting grammatical and spelling errors using TensorFlow. Features LSTM-based sequence-to-sequence modeling, data preprocessing, and real-time correction capabilities.
Reproducible pipeline and dataset artifacts for large-scale topic modeling of 52,409 astrobiology-related arXiv preprints (1996-2025) using BERTopic and comparative Top2Vec analysis.
NLP-based fake news detection using pre-trained Word2Vec embeddings and semantic feature engineering. Includes deep EDA, entity-based features, interpretable metrics, and evaluation across multiple models including Logistic Regression, Decision Tree, and Random Forest.
Machine learning project for detecting fake news articles in the Palestinian context using natural language processing (NLP), data preprocessing, and classification models. Includes text cleaning, vectorization, model training, and evaluation.
A new package facilitates extracting a concise, structured summary from user-provided news headlines or brief texts by utilizing pattern matching and LLM interactions. This tool aims to help researche
This repo includes a generalized preprocessing pipeline for text data in NLP tasks.
FiscalTone: Text Analysis of Fiscal Policy Communications from Peru's Fiscal Council. Pipeline for scraping, processing, and performing text analysis on Consejo Fiscal (cf.gob.pe) documents to score fiscal risk sentiment using LLM-based classification (GPT-4o). Produces a novel Fiscal Tone Index from time-series analysis of official reports
A new package that processes news headlines or short text inputs to generate structured summaries of events, such as service disruptions or incidents. It uses an LLM to extract key details like the co
A new package that analyzes technical arguments and extracts structured summaries from text discussions about infrastructure-as-code practices. It takes user-provided text (such as forum posts, articl
Production-ready RAG (Retrieval Augmented Generation) system built from scratch using LangChain, OpenAI, and FAISS. Features document indexing, semantic search, and AI-powered Q&A with source attribution.
Assignments and final project for the graduate course Natural Language Processing at CIMAT (Spring 2025). Includes classical and neural methods for text classification, sequence modeling, and user profiling.
Companion repository for the paper "Methodological Trends in Psychology Research: Analyzing Abstracts with NLP and Machine Learning." Includes code, term glossary, and workflows for clustering and text analysis.
This repository contains a next-word prediction model using TensorFlow and Keras to generate text in the style of Shakespeare's sonnets.
Python-based tool for scraping web essays and articles, performing dictionary-driven sentiment analysis, and computing readability and linguistic metrics. Converts qualitative text into structured quantitative insights with Excel-based input/output.
Spam email detection with Naive Bayes, comparing Multinomial, Gaussian, and Bernoulli models. Emails are converted to feature vectors using Bag of Words or TF-IDF. Multinomial NB is preferred for better accuracy with text data.
A stacking ensemble model for detecting fake reviews. The ensemble combines LogisticRegression and MultinomialNB.
A Streamlit application designed to simplify complex medical reports into easy-to-understand language for patients. The system leverages a fine-tuned FLAN-T5 model with LoRA adapters, combined with OCR text extraction (Tesseract) and NLP preprocessing (spaCy), to deliver accurate and accessible medical explanations through a clean interface
AI-powered sentiment analysis for hotel reviews using scikit-learn and DistilBERT. Features both traditional ML and transformer models with an interactive Streamlit web app for real-time predictions. Achieves high accuracy in classifying reviews as Negative, Neutral, or Positive.
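Several of the entries above pair a bag-of-words representation with a Naive Bayes classifier (for example, the spam email and IMDb sentiment projects). The following is a minimal sketch of that general idea in plain Python, not the code of any listed repository. The class name `TinyMultinomialNB`, the whitespace tokenizer, and the four training sentences are all hypothetical and chosen only for illustration; a real project would more likely use a library such as scikit-learn.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes over word counts, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed default of 1.0)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy, made-up training data (illustrative only).
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("claim your free prize")
print(prediction)
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word that is unseen for one class from driving that class's probability to zero.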