"topic:document-understanding" — Search

68 results for “topic:document-understanding”

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

Python · 74.9k · 8.4k · Updated just now

agent, agentic, agentic-ai, agentic-workflow, ai, ai-search, context-engineering, context-retrieval, deep-research, deepseek, deepseek-r1, document-parser, document-understanding, graphrag, llm, mcp, ollama, openai, rag, retrieval-augmented-generation

deepdoctection/deepdoctection

A Repo For Document AI

Python · 3.1k · 188 · Updated 12 hours ago

document-ai, document-image-analysis, document-layout-analysis, document-parser, document-understanding, layoutlm, nlp, ocr, publaynet, pubtabnet, python, pytorch, table-detection, table-recognition, tensorflow

X-PLUG/mPLUG-DocOwl

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python · 2.4k · 146 · Updated 3 days ago

chart-understanding, document-understanding, mllm, multimodal, multimodal-large-language-models, table-understanding

AlibabaResearch/AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

C++ · 1.8k · 199 · Updated 1 day ago

artificial-intelligence, computer-vision, document, document-analysis, document-intelligence, document-recognition, document-understanding, documentai, end-to-end-ocr, multimodal, multimodal-deep-learning, ocr, scene-text-detection, scene-text-detection-recognition, scene-text-recognition, text-detection, text-recognition, vision-language, vision-language-model, vision-language-transformer

tstanislawek/awesome-document-understanding

A curated list of resources on the Document Understanding (DU) topic

1.5k · 166 · Updated 1 day ago

awesome, awesome-list, deep-learning, document-ai, document-analysis, document-intelligence, document-layout-analysis, document-understanding, information-extraction, intelligent-processing, key-information-extraction, machine-learning, natural-language-processing, nlp, ocr, pdf, pdf-documents, robotic-process-automation, rpa, unstructured-data

OpenBMB/VisRAG

Parsing-free RAG supported by VLMs

Python · 931 · 71 · Updated 1 hour ago

document-retrieval, document-understanding, multi-modal, multi-modality, rag, retrieval, retrieval-augmented-generation, vision-language-model

wenwenyu/PICK-pytorch

Code for the paper "PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks" (ICPR 2020)

Python · 570 · 191 · Updated 1 week ago

document-analysis, document-understanding, graph-convolutional-network, graph-learning, graph-neural-networks, key-information-extraction

jpWang/LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

Python · 362 · 41 · Updated 1 week ago

document-ai, document-analysis, document-understanding, information-extraction, multilingual-models, multimodal-pre-trained-model, nlp

GoogleCloudPlatform/document-ai-samples

Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud

Jupyter Notebook · 310 · 115 · Updated 3 days ago

document-understanding, machine-learning, ocr, pdf, python, samples

Tan-Junwen/awesome-table-structure-recognition

A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating.

225 · 12 · Updated 4 weeks ago

document-understanding, table-detection, table-extraction, table-functional-analysis, table-structure-recognition

SCUT-DLVCLab/Document-AI-Recommendations

Algorithms, papers, datasets, performance comparisons for Document AI.

205 · 9 · Updated 1 day ago

document-ai, document-understanding, key-information-extraction, table-structure-recognition, visual-information-extraction

huggingface/chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Python · 161 · 10 · Updated 2 weeks ago

computer-vision, dataloading, datasets, distributed-training, document-understanding, multi-modal-learning, pdf-document, webdataset

Alpha-Innovator/DocGenome

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Jupyter Notebook · 152 · 7 · Updated 1 month ago

document-understanding, paper-annotation, question-answering

andreagemelli/doc2graph

Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.

Jupyter Notebook · 137 · 25 · Updated 2 weeks ago

deep-learning, document-understanding, geometric-deep-learning, gnn, key-information-extraction, layout-analysis, nlp, pytorch, table-detection

doc-analysis/ReadingBank

ReadingBank: A Benchmark Dataset for Reading Order Detection

117 · 4 · Updated 2 weeks ago

document-ai, document-intelligence, document-understanding, natural-language-processing, nlp, ocr

LynnHaDo/Document-Layout-Analysis

Object Detection Model for Scanned Documents

Jupyter Notebook · 94 · 14 · Updated 1 month ago

document-understanding, object-detection, python, yolov8

LynnHaDo/Checkbox-Detection

Checkbox Detection Model for Scanned Documents

Jupyter Notebook · 92 · 6 · Updated 3 weeks ago

computer-vision, copy-paste, deep-learning, document-understanding, object-detection, python, yolov8

athrael-soju/Snappy

🐊 Snappy's unique approach unifies vision-language late interaction with structured OCR for region-level knowledge retrieval. Like the project? Drop a star! ⭐

Python · 83 · 15 · Updated 2 days ago

colpali, computer-vision, deepseek-ocr, docker, document-retrieval, document-understanding, fastapi, multimodal-ai, multivector-search, nextjs, pdf-search, python, qdrant, rag, typescript, vector-database, vector-search, vision-ai, visual-retrieval

microsoft/CompHRDoc

Datasets and Evaluation Scripts for CompHRDoc

Python · 57 · 9 · Updated 3 days ago

document-structure-analysis, document-understanding, rag-related

3DCF-Labs/doc2dataset

3DCF / doc2dataset: token-efficient document layer with NumGuard numeric integrity and multi-framework exports for RAG & fine-tuning.

Rust · 56 · 5 · Updated 1 week ago

3dcf, cli, data-pipeline, dataset-generation, doc2dataset, document-processing, document-understanding, evaluation, fine-tuning, llm, machine-learning, nlp, numguard, ocr, rag, rust

ZeningLin/PEneo

[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.

Python · 41 · 6 · Updated 1 month ago

document-ai, document-understanding, key-information-extraction, ocr, visual-information-extraction

docling-project/docling4j

Docling4j brings the functionalities of Docling in document understanding to Java® projects

Java · 26 · 5 · Updated 2 weeks ago

ai, docling, document-parser, document-parsing, document-understanding, documents, java, pdf, pdf-converter, pdf-to-json

NExTplusplus/TAT-DQA

TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning

23 · 1 · Updated 1 month ago

document-understanding, question-answering, vqa

SCUT-DLVCLab/RFUND

[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking for End-to-end Document Pair Extraction"

20 · 0 · Updated 7 months ago

document-ai, document-understanding, key-information-extraction, ocr, visual-information-extraction

uakarsh/TiLT-Implementation

Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.

Jupyter Notebook · 18 · 0 · Updated 5 months ago

deep-learning, document-understanding, pytorch-implementation, pytorch-lightning, transformers

athrael-soju/little-scripts

A monorepo containing various utility scripts, tools, and applications for development, automation, and AI-powered tasks.

Python · 14 · 0 · Updated 1 week ago

colpali, computer-vision, cuda, deepseek-ocr, document-understanding, fastapi, flash-attention, gradio, ocr, paddle-ocr, qdrant, rag-chatbot, speech-to-text, text-to-speech, vector-search

jacobmarks/pytesseract-ocr-plugin

Run optical character recognition with PyTesseract from the FiftyOne App!

Python · 11 · 0 · Updated 8 months ago

computer-vision, document-understanding, fiftyone, nlp, ocr, plugin, python, tesseract, tesseract-ocr

Haruhiyuki/yuque-rag

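Connects Yuque knowledge bases to large language models to build an intelligent question-answering system based on RAG (retrieval-augmented generation); supports FastAPI and is compatible with the OpenAI API and local Ollama models.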

Python · 10 · 1 · Updated 1 week ago

ai-search, document-understanding, rag

VLR-CVC/DocVQA2026

Official evaluation scripts and baseline prompts for the DocVQA 2026 (ICDAR 2026) Competition on Multimodal Reasoning over Documents.

Python · 9 · 1 · Updated 12 hours ago

competition, document-understanding, multimodal-datasets, vqa-dataset

yuvaraj-kannan/preocr

Fast document classification and OCR detection. Analyzes any file type to determine if OCR is needed, saving time and money on unnecessary processing.

Python · 7 · 3 · Updated 2 weeks ago

computer-vision, document-analysis, document-classification, document-intelligence, document-processing, document-understanding, file-analysis, image-processing, layout-analysis, ocr, ocr-detection, opencv, pdf, pdf-analysis, pdf-parsing, preprocessing, python, python-library, text-detection, text-extraction
