25 results for “topic:pdf2text”
Converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependencies.
Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run 🤗 directly in your browser or in Node.js
The code base of the front-end of nocodefunctions.com
We present Ypdf, a PDF document processing application that combines the best features of existing solutions and provides the most popular and requested functionality for free to its users.
A collection of scripts to "help" you with your programming exams and assignments.
Simple and useful automation tools built with Python modules published on PyPI.
Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.
Python library and web service based on the Poppler pdftotext utility and Tesseract OCR for extracting text from PDF documents
An API in .NET Core for extracting text from documents via OCR, using several Linux libraries
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.
PdfReg is a web tool that extracts text from selected regions of a PDF document.
PDF to text
Interactively extract text from multi-column PDFs
✔️ A Python Flask API to manage PDF files.
Extract text from PDFs using OCR
A Streamlit-based app that allows users to upload PDFs or images, extract text, and engage in interactive Q&A. Using Google Generative AI, this app enables insightful conversations based on document contents. Ideal for those seeking quick answers from their files in a simple, intuitive interface.
I/O for nocodefunctions: CSV, TXT, PDF, and XLSX so far
A lightweight Python-based Software Package for daily use
This project converts PDF files into audiobooks with synchronized subtitles in .vtt format. It uses FastAPI for the backend and Microsoft's Edge TTS for text-to-speech conversion.
Web application for information extraction and named entity recognition for PDF files (work-in-progress).
A simple CLI to convert PDF files into TXT using OCR
Building an OCR pipeline for PDF to TXT
OCR tool for extracting and structuring text from images and scanned PDFs (export to .docx and .txt), with support for French and English
Data Center Advanced Walkthrough. Insert data from a PDF file into a MySQL database