For more details, refer to GitHub.
RAG is very popular these days, and many applications support talking to documents. However, performance drops sharply on complex documents because of their complicated structure. The challenge is to extract content from a complex document and organize it into a parsable form. This repo aims to solve that challenge with a method that is both fast and accurate.
YOLO is a state-of-the-art object detection model developed by Ultralytics. It comes in 5 base model sizes and has a powerful framework for training and deployment, so I chose YOLO to solve this challenge.
DocLayNet is a human-annotated document layout segmentation dataset containing 80863 pages from a broad variety of document sources. As far as I know, it's the highest-quality document layout analysis dataset.

from ultralytics import YOLO
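# Load the trained layout detection model
model = YOLO("{path to model file}")
# Run inference on a page image; the call returns a list of Results objects
pred = model("{path to test image}")
print(pred)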
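Printing `pred` dumps the raw result objects, which is hard to feed into a RAG pipeline. Below is a minimal sketch of turning one result into plain dictionaries, assuming `pred` is the list of Ultralytics `Results` objects returned above and that each one exposes `boxes.xyxy`, `boxes.cls`, `boxes.conf`, and `names`. The helper name `layout_regions` and the `min_conf` threshold are hypothetical, chosen for illustration only, and the top-to-bottom sort is a naive stand-in for real reading-order logic.

```python
from typing import Any, Dict, List


def layout_regions(result: Any, min_conf: float = 0.25) -> List[Dict[str, Any]]:
    """Flatten one detection result into a list of plain dicts.

    `result` is assumed to look like an Ultralytics Results object:
    result.boxes.xyxy / .cls / .conf support .tolist(), and
    result.names maps class ids to label strings.
    """
    boxes = result.boxes
    regions = []
    for xyxy, cls_id, conf in zip(
        boxes.xyxy.tolist(), boxes.cls.tolist(), boxes.conf.tolist()
    ):
        # Skip low-confidence detections (threshold is an assumption).
        if conf < min_conf:
            continue
        regions.append(
            {
                "label": result.names[int(cls_id)],
                "confidence": float(conf),
                "box": [float(v) for v in xyxy],  # x1, y1, x2, y2 in pixels
            }
        )
    # Naive ordering: top edge first, then left edge.
    regions.sort(key=lambda r: (r["box"][1], r["box"][0]))
    return regions


# Usage with the prediction above (one Results object per input image):
# for region in layout_regions(pred[0]):
#     print(region["label"], region["confidence"], region["box"])
```

Keeping the output as plain dictionaries means the regions can be serialized to JSON or passed to a text extraction step without depending on the detection library's types.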
More details about DocLayNet, including the download, can be found at this link. It has 11 labels: