DocLD-FinTabNet: Leading Table Extraction on Financial Documents

Financial documents are table-heavy. SEC filings, annual reports, and 10-Ks pack dense tables with multi-level headers, merged cells, and numeric data that must be extracted accurately for downstream analysis. FinTabNet — 112K+ tables from S&P 500 company annual reports — is the standard benchmark for table extraction on financial documents.
We ran DocLD's agentic table extraction on a 1,000-table sample from FinTabNet, using the same Needleman-Wunsch scoring methodology as RD-TableBench, and compared against published academic baselines. The results demonstrate DocLD's strength on financial table extraction.
Why FinTabNet
FinTabNet was introduced by IBM Research in the Global Table Extractor (GTE) paper. It contains 89,646 pages with 112,887 tables from S&P 500 annual reports. Cell structure labels were generated through token matching between PDF and HTML versions — programmatic but high-quality for the financial domain.
Financial tables have distinct properties: diverse styles, fewer graphical lines, larger gaps, and more color variation than scientific documents. FinTabNet tests how well extraction tools handle real-world financial table layouts.
We used FinTabNet_OTSL — a conversion with corrected annotations and OTSL structure format — sampling 1,000 tables from the test split.
The Results

DocLD achieves ~90% average table accuracy on the FinTabNet sample. Academic baselines (GTE, TATR) use different evaluation setups and metrics — GTE reports cell structure recognition improvements; TATR reports ICDAR-2013 scores when trained on FinTabNet. Direct comparison is limited, but DocLD's strong performance on financial tables is clear.
Score Distribution

The per-sample score distribution shows DocLD's consistency: the majority of tables score above 85%, with a tight interquartile range. Financial tables with merged headers, subtotals, and dense numeric grids are handled reliably.
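The summary statistics behind a claim like this (median and interquartile range over the per-sample scores) are straightforward to compute from the results JSON. A minimal sketch, with function names of our own choosing:

```typescript
// Summarize a per-sample score distribution (median, IQR).
// `scores` stands in for the 1,000 per-table accuracy values.
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  // Linear interpolation between the two nearest ranks.
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

function summarize(scores: number[]): { median: number; iqr: number } {
  const s = [...scores].sort((a, b) => a - b);
  return {
    median: quantile(s, 0.5),
    iqr: quantile(s, 0.75) - quantile(s, 0.25), // interquartile range
  };
}
```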
Evaluation Methodology
We used the same scoring as RD-TableBench:
- Needleman-Wunsch hierarchical alignment (cell-level + row-level)
- Parameters: S_ROW_MATCH = 5, G_ROW = -3, S_CELL_MATCH = 1, P_CELL_MISMATCH = -1, G_COL = -1
- Cell normalization: strip whitespace, newlines, hyphens
- HTML → 2D array: expand rowspan/colspan for alignment
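A minimal TypeScript sketch of this scoring. The parameter values come from the list above; how the cell-level and row-level passes combine (`rowSimilarity`, `tableScore`) is our reading of the methodology, not a verified port of the RD-TableBench harness:

```typescript
// Scoring parameters from the RD-TableBench methodology.
const S_ROW_MATCH = 5, G_ROW = -3, S_CELL_MATCH = 1, P_CELL_MISMATCH = -1, G_COL = -1;

// Cell normalization: strip whitespace, newlines, and hyphens before comparing.
function normalizeCell(cell: string): string {
  return cell.replace(/[\s\n-]/g, "");
}

// Generic Needleman-Wunsch: optimal global alignment score of two sequences.
function nw<T>(a: T[], b: T[], score: (x: T, y: T) => number, gap: number): number {
  const dp = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = 1; i <= a.length; i++) dp[i][0] = i * gap;
  for (let j = 1; j <= b.length; j++) dp[0][j] = j * gap;
  for (let i = 1; i <= a.length; i++)
    for (let j = 1; j <= b.length; j++)
      dp[i][j] = Math.max(
        dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]), // align a[i-1] with b[j-1]
        dp[i - 1][j] + gap, // gap: skip a[i-1]
        dp[i][j - 1] + gap, // gap: skip b[j-1]
      );
  return dp[a.length][b.length];
}

// Cell level: align the cells of two rows under the G_COL gap penalty.
const cellScore = (x: string, y: string) =>
  normalizeCell(x) === normalizeCell(y) ? S_CELL_MATCH : P_CELL_MISMATCH;
const rowSimilarity = (r1: string[], r2: string[]) => nw(r1, r2, cellScore, G_COL);

// Row level: align predicted rows against ground-truth rows under G_ROW gaps,
// treating a net-positive cell alignment as a row match.
const tableScore = (pred: string[][], truth: string[][]) =>
  nw(pred, truth, (r1, r2) => (rowSimilarity(r1, r2) > 0 ? S_ROW_MATCH : 0), G_ROW);
```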
DocLD was run with `agenticTables: true` (agentic extraction mode) and HTML table output.
Reproduce Our Results
Our evaluation code is open-source at github.com/Doc-LD/fintabnet-bench.
What This Means for Financial Documents
If you process SEC filings, annual reports, or financial statements:
- Merged headers and subtotals: DocLD's agentic extraction preserves structure with explicit rowspan/colspan.
- Dense numeric tables: VLM-based extraction handles thin borders and small text.
- Consistent output: HTML format with proper structure for downstream pipelines.
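The explicit rowspan/colspan structure is also what makes the "HTML → 2D array" scoring step well defined: each spanned cell is repeated into every grid slot it covers. A minimal sketch, where the `Cell` shape is our own simplification of a parsed `<td>`/`<th>`:

```typescript
// One parsed table cell; rowSpan/colSpan default to 1 when absent.
interface Cell { text: string; rowSpan?: number; colSpan?: number }

// Expand rowspan/colspan so every grid position holds a cell value,
// which is the flat form the alignment scorer consumes.
function expandGrid(rows: Cell[][]): string[][] {
  const grid: string[][] = [];
  rows.forEach((row, r) => {
    grid[r] ??= [];
    let c = 0;
    for (const cell of row) {
      while (grid[r][c] !== undefined) c++; // skip slots filled by earlier rowspans
      for (let dr = 0; dr < (cell.rowSpan ?? 1); dr++) {
        grid[r + dr] ??= [];
        for (let dc = 0; dc < (cell.colSpan ?? 1); dc++) {
          grid[r + dr][c + dc] = cell.text; // repeat the value into every spanned slot
        }
      }
      c += cell.colSpan ?? 1;
    }
  });
  return grid;
}
```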
Data Sources and References
| Resource | Link |
|---|---|
| DocLD evaluation code | github.com/Doc-LD/fintabnet-bench |
| FinTabNet_OTSL dataset | HuggingFace |
| GTE paper (FinTabNet origin) | arXiv:2005.00589 |
| Aligning benchmark datasets (TATR) | arXiv:2303.00716 |
| RD-TableBench methodology | DocLD-TableBench blog |
Frequently Asked Questions
How was the benchmark run?
We ran DocLD's agentic table extraction against 1,000 table images from the FinTabNet_OTSL test split and scored the results using the same Needleman-Wunsch grading as RD-TableBench. DocLD was invoked with `agenticTables: true` and HTML output.
How does DocLD compare to academic baselines?
GTE and TATR are the main academic baselines for FinTabNet. They use different metrics (cell structure recognition, ICDAR-2013 exact match) and evaluation setups, so we report our Needleman-Wunsch scores for consistency with RD-TableBench; the comparison is indicative rather than apples-to-apples.
Can I reproduce the results?
Yes. Clone fintabnet-bench, then run `npm run download`, `npm run extract` (with `DOCLD_API_KEY` set), and `npm run score`. Results are written as JSON with per-sample scores.
How long does the evaluation take?
Agentic table extraction takes approximately 3 seconds per table image, so a sequential run over 1,000 tables takes about 50 minutes. Use concurrency for faster runs.
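The concurrency suggestion can be as simple as a small promise pool. `runPool` below is our own helper, not part of any DocLD SDK, and `task` stands in for one extract-and-score call:

```typescript
// Run an async task over all items with at most `concurrency` in flight.
async function runPool<T, R>(
  items: T[],
  task: (item: T) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker pulls the next unclaimed index until the queue is drained.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, worker),
  );
  return results;
}
```

At roughly 3 seconds per table, a concurrency of 10 would cut a 1,000-table run from about 50 minutes to around 5, subject to any API rate limits.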