44 results for “topic:pymupdf-fitz”

This Python-based tool allows for efficient comparison of two or more PDF documents, highlighting the differences between them. It extracts and compares the words in the PDFs, ignoring whitespace differences, and highlights the changed, added, or missing words.

Python100Updated 1 month ago

comparison-tool, differences-detected, difflib, fitz, multiple-pdfs, pdf, pdf-comparison, pdf-comparison-highlight-differences, pymupdf-fitz, python, text-comparison

vickypandey14/Convert-PDF-into-Image-By-Python

This Python script converts each page of a PDF document into separate image files. It utilizes the PyMuPDF library (fitz) to handle PDF operations and the Python Imaging Library (PIL) for image processing.

Python31Updated 8 months ago

pdf-converter, pymupdf, pymupdf-fitz, python, python-script

pawankumar94/graphscribe-table-extractor

Graphscribe is an intelligent, LLM-powered document understanding system designed to extract structured insights from complex visual content such as statistical diagrams, charts, and graphs.

Python30Updated 7 months ago

gemini-flash, genai-chatbot, langchain, ocr-recognition, pymupdf, pymupdf-fitz, qwen2-5, tesseract-ocr

Kurama-90/GUI-PDF-to-Excel

PyQt5-based GUI application that allows users to convert PDF files into Excel files. The application provides multiple options for extracting data from PDFs, including tables, text, and OCR (Optical Character Recognition).

Python20Updated 2 months ago

data, easyocr, excel, gui, numpy, opencv-python, pandas, pdf, pdf2image, pdfplumber, poppler, pymupdf-fitz, pyqt5, python

das-amlan/PDF_Image_Extractor_Web_App

This is a simple web app that allows users to upload a PDF file, extract images from the PDF, and display the images in the web app.

Python23Updated 1 year ago

fitz, flask, html, pymupdf-fitz, python, streamlit

devbm7/QGen

Question Generator System

Python21Updated 7 months ago

json, ml, nlp, nltk, pandas, pymupdf-fitz, python3, pytorch, regular-expressions, smtp, spacy, streamlit, t5-large, transformers, wikipedia-api

mcagriaksoy/diff_merge_pdf

A tool to compare and merge PDFs, display the differences between them, and run OCR on them.

Python21Updated 1 month ago

diff-tool, diff-tool-pdf, ocr-recognition, ocr-text-reader, pdf-comparison, pdf-document-processor, pdf-generator, pdf-merger, pdf-ocr, pdf-ocr-extraction, pdf-viewer, pdf-visual-testing, pymupdf-fitz, pyqt6-desktop-application, x-ray-images

ifte110/Serach_all_pdfs_by_string

Search through all pdf files in a folder for a specific keyword or string of keywords.

Python20Updated 1 year ago

pdfsearchtool, pymupdf-fitz, python

HoustonAlexander/DATAFORGE

Toolkit for research admin tasks at NCSU; compatible with WRS reports

Python10Updated 1 month ago

pandas, pymupdf-fitz, python, regex, tkinter, win32

Sazizi2025/PDF-Founder

Are you short on time?! Can't you search all the PDFs one by one for the content you want?! Well, PDF-Founder is here...

Python10Updated 2 years ago

easy-to-use, graphical, gui, image, image-generator, pdf, pdf-search, pdf-search-engine, ptl, pymupdf, pymupdf-fitz, pysimplegui, python, rgb, snipping, tesseract, tesseract-ocr

ZobayerAkib/AI-Invoice-Analyzer

An AI-powered invoice and receipt analyzer that extracts structured invoice data from images (JPG/PNG) and PDF documents using a Large Language Model (LLM).

Python10Updated 1 week ago

fastapi, image, invoice-analysis, llm, openai-api, pdf, pdf-text-extraction, pymupdf-fitz

atthharvva/PDF-Form-Reader

This Python script extracts information from PDF forms using OCR (Optical Character Recognition) and saves the extracted data into an Excel file. It is particularly designed for processing forms with checkboxes and textual fields. The script can handle variations in form structure and allows for easy customization to accommodate other PDF form type

Python10Updated 3 months ago

csv-export, forms, graphical-checkboxes, ocr-text-reader, openpyxl, pdf-document-processor, pdf-forms, pillow, pymupdf-fitz, python

MelinaNorton/journal-vetter

Python CLI & library for automated journal vetting — GPT‑4.1 summarization, YAML configuration, reproducible analysis.

Python10Updated 1 week ago

academic-journals, chatgpt, cli, document-embedding, document-embeddings, gpt-4, langchain, langchain-python, llm, llms, openai, pdf-processing, pymupdf, pymupdf-fitz, pypi, pypi-package, python, research-tool, text-summarization

ramyadjoshi/IntelliDoc-AI-Powered-Intelligent-Document-Analysis-System

IntelliDoc is an intelligent document understanding system that helps users extract, analyze, and query information from PDFs, scanned documents, images, and multilingual reports using OCR, AI, and Retrieval-Augmented Generation (RAG)

Python10Updated 3 weeks ago

artificial-intelligence, faiss-vector-database, groq-api, llama, opencv, optical-character-recognition, pillow-library, pymupdf-fitz, python, rag-pipeline, streamlit, tesseract-ocr, tf-idf-vectorizer

City-of-Memphis-Wastewater/pdflinkcheck

Analyze all GoTo links, URI hyperlinks, URI file links, and TOC entries in a target PDF using a CLI and GUI wrapper for PyMuPDF. PyPI: https://pypi.org/project/pdflinkcheck Microsoft Store: https://apps.microsoft.com/detail/9n11hxvls1wg

Python10Updated 1 day ago

agplv3, goto, pdf, pdf-links, pymupdf, pymupdf-fitz, toc

OtenMoten/pdf-alchemist

It's designed for transmuting PDFs into HTML. Harness the power of OCR, image processing, and web technologies to unlock the secrets within your PDF documents.

Python00Updated 1 year ago

beautifulsoup4, dominate, pdf-converter, pdf-document-processor, pillow, pymupdf-fitz, python, tdqm, tesseract-ocr

ParthaPRay/pdf_text_extraction_json_section_subsection

This repo contains code for extracting PDF text to JSON, capturing the section number, section title, section body content, and footnotes.

Python00Updated 1 year ago

article-extractor, document, extraction, json, pdf, pymupdf-fitz, regex, text

Sanschinu95/Maxcavator2.0

Maxcavator 2.0 is an intelligent, AI-native PDF Data Extraction and Retrieval-Augmented Generation (RAG) system. It fundamentally changes how you understand and interact with your PDF documents by instantly extracting complex structures (sections, tables, images) and generating robust RAG indices.

JavaScript00Updated 6 days ago

ai-chat, document-analysis, fastapi, llama3, ocr, pdf-extraction, pdf-parser, pymupdf-fitz, python, rag, retrieval-augmented-generation, semantic-search, sentence-transformers, vector-search

helgesander02/TKFruitMG

An ERP system that uses customtkinter as the GUI base, with a postgreSQL database and reportlab, win32print, and pymupdf-fitz design.

Python00Updated 2 years ago

customtkinter, postgresql, pymupdf-fitz, reportlab, win32print

IglesiasT/comparador-pdfs

No description provided.

Python00Updated 1 year ago

pdf-comparison, pymupdf-fitz, python

Deepcoders30/AI-CHATPDF

ChatPDF is a web application that lets users upload PDFs and ask questions about their content.

TypeScript00Updated 7 months ago

faiss-vector-database, fastapi, groq-integration, javascript, langchain, pymupdf-fitz, reactjs, typescript

RishavKumarSinha/adobe-hackathon-solution

Solution for the Adobe India Hackathon 2025, Team - Codient (Team Leader - Gopal Ranjan, Team Members - Rishav Kumar Sinha)

Python00Updated 7 months ago

containerization, docker, nlp-machine-learning, pymupdf-fitz, python

saikaryekar/pdf-layout-mapper

A Python tool for extracting text regions from PDF files, visualizing them as bounding boxes, and exporting structured data in JSON format.

Python00Updated 4 months ago

bbx, cli, pymupdf-fitz, shapely

bilalhameed248/PDF-Document-Extraction

Python PDF-to-HTML Converter: Transforming PDF Documents into Structured HTML Tags. - Feb 2022 - Jun 2023

Python00Updated 2 years ago

document, extraction, fitz, parser, parsing, pdf, pymupdf, pymupdf-fitz, python, python3

nngel/PDF-thumbnail-service

A production-ready FastAPI microservice that functions as a PDF thumbnail generator, converting the first page of PDF files to optimized PNG thumbnails.

Python01Updated 9 months ago

fastapi, image, pdf, pymupdf-fitz, python, thumbnail-generator, vercel

RomyJr/PDF_TXT_Word_research

This application simplifies PDF keyword searches, allowing users to easily find specific terms in files or folders. Results are displayed clearly, and the history feature enables quick review and filtering of past searches. Users can click on document links in the history to open them directly in the default PDF viewer.

Python00Updated 2 years ago

pymupdf, pymupdf-fitz, pyqt5

FrancisLauriano/chatsoftex

A platform developed in Python that aims to automate and speed up the evaluation of technological innovation projects, using artificial intelligence and standardized criteria based on the Lei do Bem.

Python00Updated 1 year ago

cryptography, fernet, firebase, flask, flask-jwt-extended, hugging-face-transformers, numpy, openai, pdfplumber, postgresql, pyjwt, pymupdf-fitz, pypdf2, python, pytorch, scikit-learn, scipy, spacy, sqlalchemy, tensorflow

ashutosh6500/Resume-Parser-AWS-Event-Driven-Workflow

This is a simple event-driven mini project built on AWS services such as Lambda, EC2, DynamoDB, S3, and SNS.

Python00Updated 2 years ago

aws, aws-projects, event-driven-architecture, lambda-layers, pymupdf-fitz, resume-parser

Jatin-s16/Resume-check-portal-for-candidates

A Streamlit-based application that enables job seekers to evaluate and enhance their resumes by analyzing alignment with specific job descriptions, providing actionable insights for improvement.

Jupyter Notebook00Updated 11 months ago

Lazarokaua/Organiza-pasta-obsidian

File organization for my Obsidian

Python00Updated 7 months ago

google-gemini-api, pymupdf-fitz, python, python-dotenv
