Table Extraction
Table extraction is the process of identifying tabular data in documents and pulling it into structured rows and columns. DocLD’s parsing detects tables via layout analysis and outputs them in a structured form, so they can be chunked as units or extracted into fields.
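To make "structured rows and columns" concrete, here is a minimal sketch of how a detected table might be represented and rendered as a single chunk. The `Table` class and `table_to_chunk` function are hypothetical names used for illustration; they are not DocLD's actual types or output format.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical container for one detected table (not DocLD's actual type)."""
    header: list[str]
    rows: list[list[str]]

    def to_records(self) -> list[dict[str, str]]:
        # Pair each cell with its column name so rows become field dictionaries.
        return [dict(zip(self.header, row)) for row in self.rows]


def table_to_chunk(table: Table) -> str:
    """Render the whole table as one text chunk so it is never split mid-table."""
    lines = [" | ".join(table.header)]
    lines += [" | ".join(row) for row in table.rows]
    return "\n".join(lines)


# Illustrative data only.
invoice = Table(
    header=["Item", "Qty", "Price"],
    rows=[["Widget", "2", "9.50"], ["Gadget", "1", "20.00"]],
)
```

Calling `table_to_chunk(invoice)` yields one string containing the header and every row, which is the property that matters when a table is chunked as a unit.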
Use Cases
- RAG — Tables are kept together in chunks so vector search and citation return coherent table content.
- Structured extraction — Extraction schemas can define fields that map to table cells (e.g., line items, totals).
- Export — Table data can be exported to CSV or used in downstream systems.
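The structured extraction and export use cases above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the `rows` data, the field names (`description`, `quantity`, `unit_price`), and the helper functions are all hypothetical and do not reflect DocLD's schema format or API.

```python
import csv
import io
from decimal import Decimal

# Hypothetical parsed table; the first row is assumed to be the header.
rows = [
    ["Item", "Qty", "Unit Price"],
    ["Widget", "2", "9.50"],
    ["Gadget", "1", "20.00"],
]


def to_csv(table_rows):
    """Serialize table rows to a CSV string."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table_rows)
    return buffer.getvalue()


def to_line_items(table_rows):
    """Map table cells to typed line-item fields (assumed schema)."""
    header, *body = table_rows
    items = []
    for row in body:
        record = dict(zip(header, row))
        items.append({
            "description": record["Item"],
            "quantity": int(record["Qty"]),
            "unit_price": Decimal(record["Unit Price"]),
        })
    return items


def total(items):
    """Sum quantity * unit price across all line items."""
    return sum((i["quantity"] * i["unit_price"] for i in items), Decimal("0"))
```

`Decimal` is used instead of `float` so that currency amounts are summed without binary rounding error; a real schema would also need to handle missing cells and merged headers, which this sketch ignores.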
Table extraction works on native PDFs and, where layout is detected, on OCR output. Quality depends on how clear the document is and how accurately layout analysis detects the table.
Related Concepts
Table extraction is part of parsing and layout analysis. It feeds chunking and extraction, and its output is typically structured data or schema-based fields.