87 results for “topic:layout-analysis”
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
A Unified Toolkit for Deep Learning Based Document Image Analysis
An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
Read and extract text and other content from PDFs in C# (port of PDFBox)
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
OCR engine for all the languages
Resource repository for Document Layout Analysis development with PdfPig.
A toolbox of OCR models and algorithms based on MindSpore
Layout analysis for Chinese and English documents
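📝 Extracts content from document images and reproduces each image one-to-one in Word or Txt output for further use or processing. Planned: PDF/image input with output in JSON, Txt, Word, and Markdown formats.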
YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
A Node.js library based on PaddleOCR
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
A Unified Toolkit for Deep Learning-Based Table Extraction
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
A keyboard layout that provides an elegant and balanced typing experience by its use of a thumb-alpha, emphasis on middle fingers, deprioritisation of pinkies, and arcane keys.
A Large Dataset of Historical Japanese Documents with Complex Layouts
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
Proof of concept of training a simple Region Classifier using PdfPig and ML.NET (LightGBM). The objective is to classify each text block in a PDF document page as title, text, list, table, or image.
Chinese layout detection with java-yolov8: java-yolov8 is used to detect the layout of Chinese document images
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
MinerU all-in-one bundle: no-install deployment with one-click launch
A curated list of resources on Document Layout Analysis
An extensible web page segmentation and analysis framework.
pdfDet aims to simplify PDF layout detection tasks for users.
Open Dataset for the Recognition and Analysis of Scripts in Arabic Maghrebi (ICDAR 2021, CHR 2024)
Fast document classification and OCR detection. Analyzes any file type to determine if OCR is needed, saving time and money on unnecessary processing.