"topic:pdf2text" — Search

25 results for “topic:pdf2text”

Converts binary PDFs to JSON and text for server-side PDF processing and command-line use. Zero dependencies.

json, pdf, pdf-converter, pdf-form, pdf-text, pdf2form, pdf2json, pdf2text

Pure TypeScript, cross-platform module for extracting text, images, and tabular data from PDFs. Run 🤗 directly in your browser or in Node.js

TypeScript15615Updated 6 hours ago

pdf, pdf-parse, pdf-parser, pdf-screenshot, pdf-table, pdf-thumbnail, pdf-to-image, pdf-to-text, pdf-tools, pdf-utils, pdf-viewer, pdf2image, pdf2json, pdf2pic, pdf2text, pdfjs, pdfjs-dist, turkey

seinecle/nocodefunctions-web-app

The code base of the front-end of nocodefunctions.com

Java427Updated 14 hours ago

data-processing, data-science, jakarta-faces, java, network-analysis, nlp, nocode, pdf-to-text, pdf2text, sentiment-analysis, text-mining, topic-modeling, webapp

yakovypg/Ypdf

We present Ypdf, a PDF document processing application that combines the best features of existing solutions and provides the most popular and requested functionality for free to its users.

C#245Updated 4 months ago

compress-pdf, crop-pdf, cross-platform-pdf, divide-pdf, image2pdf, merge-pdf, page-numbers-pdf, pdf, pdf-converter, pdf-password, pdf-tools, pdf-watermark, pdf2image, pdf2text, remove-pages-pdf, reorder-pdf, rotate-pdf, split-pdf, text2pdf

TheLime1/CheatoMate

A collection of scripts to "help" you with your programming exams and assignments.

Python171Updated 5 months ago

ai, assignment, chat, cheat, cheating, codebase, exam, image2text, network-card, pdf2text

chiraag-kakar/PyAutomation

Simple, useful automation tools built with Python modules published on PyPI.

Python111Updated 1 year ago

beautifulsoup4, pdf2text, pypi-packages, python-automation, python3, regex, regexp-search, requests, truth-table-generator, ttg, worldometer-scraping, worldometers

worldbank/wb-nlp-tools

Natural language processing tools developed by the World Bank's DECAT unit. A suite of text preprocessing and cleaning algorithms for NLP analysis and modeling.

Python107Updated 1 year ago

gensim, langdetect, nlp, nltk, pdf2text, python, spacy, text-mining

andrealenzi11/py-poppleract

Python library and Web service based on Poppler Pdftotext utility and Tesseract OCR for extracting text from PDF documents

Python102Updated 9 months ago

ocr, optical-character-recognition, pdf-reader, pdf-splitting, pdf-to-text, pdf2text, pdftotext, poppler, poppleract, py-poppleract, tesseract, tesseract-ocr, text-extraction

StephanyBatista/ExtractOcrApi

An API in .NET Core for extracting text from documents via OCR, using several Linux libraries

C#73Updated 1 year ago

aspnetcore, doc2txt, ocr, pdf2text, tesseract

AzozzALFiras/Pdf-OCR

A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.

HTML61Updated 8 months ago

azozzalfiras, image-ocr, ocr, pdf-ocr, pdf2text, pdf2txt

TanishqChamoli/Newspaper_Mining

Newspaper mining and analysis of the results using Python. Cleaning the text using OCR.

Python50Updated 4 months ago

data-science, dataset, mining, newspaper, newspaper-mining, ocr, pdf2text, python3, tesseract-ocr, tool, webcrawling, wget

imesut/PdfReg (Archived)

PdfReg is a web tool that extracts text from selected regions of a PDF document.

JavaScript41Updated 1 year ago

pdf, pdf-converter, pdf-viewer, pdf2text, pdfjs

views63/pdf2text

pdf to text

Rust20Updated 6 years ago

pdf2text, pdftotext, rust

DrMcCoy/pdftextorizer

Interactively extract text from multi-column PDFs

Python20Updated 1 year ago

gui, pdf, pdf-extractor, pdf-files, pdf2text, pdftotext, pyqt5, qt5

senavs/pdfto

A Python Flask API to manage PDF files.

Python10Updated 6 years ago

api, api-rest, docker, flask, flask-restful, flask-restplus, microservice, pdf, pdf2image, pdf2text, rest-api

Isaccseven/pdf2text

Extract text from PDFs using OCR

Python10Updated 4 years ago

ocr, pdf2text, pypdf, pytesseract, python, rich, typer

sahil352005/ChatWithPdf-Images

A Streamlit-based app that allows users to upload PDFs or images, extract text, and engage in interactive Q&A. Using Google Generative AI, this app enables insightful conversations based on document contents. Ideal for those seeking quick answers from their files in a simple, intuitive interface.

Python10Updated 1 year ago

chat-application, chatbot, gemini-api, ocr, pdf2text, pillow, pytesseract, python, streamlit, textextracting

seinecle/nocodefunctions-io

io for nocodefunctions: csv, txt, pdf, and xlsx so far

Java10Updated 1 week ago

csv-parser, parsers, pdf-parser, pdf-to-text, pdf2text, xlsx-parser

zhangshi0512/DevTools

A lightweight Python-based Software Package for daily use

Python10Updated 6 months ago

image-processing, local, pdf2text, rag, retrieval-augmented-generation

NikhilTeja21/Audio-Books

This project converts PDF files into audiobooks with synchronized subtitles in .vtt format. It uses FastAPI for the backend and Microsoft's Edge TTS for text-to-speech conversion.

Python00Updated 10 months ago

edge-tts, pdf2text, tts

fer-aguirre/pdf-2-ner

Web application for information extraction and named entity recognition for PDF files (work-in-progress).

Jupyter Notebook00Updated 3 years ago

named-entity-recognition, nlp, pdf2text, streamlit, text-analysis

davibusanello/pdf2txt

A simple CLI to convert PDF files into TXT using OCR

Python00Updated 1 year ago

cli, cli-app, mit-license, pdf-converter, pdf-to-text, pdf2text, pdf2txt, python, python3, tesseract-ocr

SeeligA/OCRstream

Building an OCR pipeline for PDF to TXT

Python00Updated 6 years ago

ocr-processing, pdf2text

Nath9666/Lexo

OCR tool for extracting and structuring text from images and scanned PDFs (exports to .docx and .txt), with support for French and English

Python00Updated 1 month ago

desktop-app, document-conversion, docx, drag-and-drop, gui, image-processing, logging, multilingual, ocr, pdf2image, pdf2text, python, python-docx, scanned-pdf, tesseract-ocr, text-extraction, txt

ChrisCraddock/DC-Advanced-Walkthrough

Data Center Advanced Walkthrough. Insert data from a PDF file into MySQL database

Python00Updated 4 years ago

database, datacenter, deque, mysql, pdf2text, phpmyadmin-database, python, python-script, sql