"topic:document-ai" — Search

AI-powered StartUp Accelerator Engine built with Next.js, LangChain, PostgreSQL + pgvector. Upload, organize, and chat with documents. Includes predictive missing-document detection, role-based workflows, and page-level insight extraction.

JavaScript 788111 Updated 2 days ago

ai-chatbot document-ai document-analysis drizzle-orm full-stack langchain llm-app nextjs ocr openai pgvector postgresql rag rag-chatbot typescript vector-search

jpWang/LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Python 36241 Updated 1 week ago

document-ai document-analysis document-understanding information-extraction multilingual-models multimodal-pre-trained-model nlp

SCUT-DLVCLab/Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI.

2059 Updated 1 day ago

document-ai document-understanding key-information-extraction table-structure-recognition visual-information-extraction

harumiWeb/exstruct

Conversion from Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines, and autonomous Excel reading and writing by AI agents through MCP integration.

Python 12920 Updated 3 hours ago

data-ex document-ai excel excel-automation excel-parsing llm mcp-server python-library rag structured-data xlwings

doc-analysis/ReadingBank

ReadingBank: A Benchmark Dataset for Reading Order Detection

1174 Updated 2 weeks ago

document-ai document-intelligence document-understanding natural-language-processing nlp ocr

clovaai/webvicob

Official Implementation of Web-based Visual Corpus Builder (Webvicob), ICDAR 2023

Python 1098 Updated 5 months ago

document-ai icdar2023 nlp ocr

nttmdlab-nlp/SlideVQA

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)

Python 1058 Updated 2 weeks ago

aaai2023 computer-vision document-ai nlp ocr

PSPDFKit/ai-assistant-demo

AI Document Assistant for PSPDFKit Demo showcases how to interact with PDFs using natural language commands powered by AI, integrated with PSPDFKit for Web.

JavaScript 644 Updated 1 week ago

ai ai-assistant chat document-ai document-processing llm natural-language nutrient pdf pspdfkit web-sdk

PSPDFKit/nutrient-dws-mcp-server

A Model Context Protocol (MCP) server implementation that integrates with the Nutrient Document Web Service (DWS) Processor API, providing powerful PDF processing capabilities for AI assistants.

TypeScript 635 Updated 1 week ago

ai-agents claude document-ai document-processing langchain llm mcp mcp-server model-context-protocol nutrient openai pdf pdf-processing

nttmdlab-nlp/VDocRAG

[CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents

Python 615 Updated 3 days ago

computer-vision cvpr2025 document-ai nlp ocr

ZeningLin/ViBERTgrid-PyTorch

An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"

Python 535 Updated 11 months ago

document-ai document-analysis information-extraction key-information-extraction visual-information-extraction

googleapis/python-documentai-toolbox (Archived)

This library has moved to https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-documentai-toolbox

Python 5322 Updated 6 days ago

ai document-ai gcp generative-ai google-cloud google-cloud-platform vertex-ai

whn09/table_structure_recognition

Table detection (TD) and table structure recognition (TSR) using YOLOv5/YOLOv8, with results that match or even exceed Table Transformer (TATR) while using smaller models.

Jupyter Notebook 5118 Updated 6 months ago

document-ai ocr table table-detection table-structure-recognition yolov5 yolov8

DunnBC22/Vision_Audio_and_Multimodal_Projects

This repository includes all computer vision, audio, document AI, and multimodal projects.

Jupyter Notebook 5112 Updated 1 month ago

audio-classification computer-vision document-ai multimodal-deep-learning object-detection optical-character-recognition transfer-learning transformers

I3K-IT/RAG-Enterprise

🚀 100% local RAG system with one-command setup. Your data never leaves your server. AGPL-3.0

Python 437 Updated 1 day ago

docker document-ai enterprise fastapi gpu langchain llm local-ai ocr ollama privacy qdrant rag react self-hosted

ZeningLin/PEneo

[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.

Python 416 Updated 1 month ago

document-ai document-understanding key-information-extraction ocr visual-information-extraction

qyhou/curated-table-structure-recognition

A curated list of resources on Table Structure Recognition

332 Updated 1 day ago

document-ai document-intelligence table-recognition table-structure-recognition

OpenDCAI/Flash-MinerU

Ray-based accelerator for the MinerU VLM inference pipeline. Lightweight, low-intrusion PDF → Markdown processing for multi-GPU environments, including domestic (Chinese) compute hardware.

Python 333 Updated 2 days ago

distributed-computing document-ai llm-inference mineru multi-gpu parallel-computing pdf pdf-parsing pdf2markdown ray

Unstructured-IO/community (Archived)

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

297 Updated 6 months ago

community data-pipeline deep-learning document-ai document-parsing machine-learning nlp-parsing ocr-python open-source preprocessing-data

seehiong/voicedoc-agent

🎙️ Voice-native document intelligence using Gemini, ElevenLabs STT/TTS, and Datadog observability — turning text documents into spoken conversations.

TypeScript 252 Updated 2 weeks ago

agent-architecture ai-observability cloud-run datadog document-ai elevenlabs expressive-voice gemini google-cloud hackathon llm-agents llm-monitoring nextjs observability rag speech-to-text text-to-speech vertex-ai voice-ai voice-first

Shulk97/daniel

This repository contains the implementation of DANIEL, a fast Document Attention Network for Information Extraction and Labeling of handwritten documents.

Python 211 Updated 1 week ago

computer-vision document-ai multimodal-pre-trained-model nlp ocr

SCUT-DLVCLab/RFUND

[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"

200 Updated 7 months ago

document-ai document-understanding key-information-extraction ocr visual-information-extraction

chenxn2020/GOSE

[Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"

Python 171 Updated 1 year ago

document-ai relation-extraction

conditionedstimulus/DocumentClassifier

FastAPI application for document classification using a multimodal LayoutLM model, designed to classify PDF documents into RVL-CDIP categories.

Jupyter Notebook 120 Updated 2 months ago

document-ai fastapi layoutlmv3 machine-learning nlp python

NirmalNagaraj/DocGPT

A chatbot for document analysis.

Python 120 Updated 4 months ago

ai chatbot document-ai
