Blog

How we're building DocLD, why we're here, and where we're headed.

DocLD on FUNSD: Form Understanding Performance Analysis

We ran DocLD's document parsing on the official FUNSD testing set — 50 noisy scanned form images with ground-truth annotations. Here's a performance analysis with visualizations and insights into OCR quality on real-world forms.

Mar 8, 2026 · Tejaswi Suresh · Founder

Build vs. Buy for Document Processing: Choosing the Right Approach

A practical framework for deciding when to build document processing in-house versus using an API or platform — volume, complexity, compliance, and total cost.

Mar 7, 2026 · Tejaswi Suresh · Founder

DocLD-FinTabNet: Leading Table Extraction on Financial Documents

We benchmarked DocLD's table extraction on FinTabNet — 500 financial tables from S&P 500 SEC filings — scoring 82.1% accuracy with zero failures and outperforming GTE (IBM) and TATR (Microsoft). Here's what we found.

Mar 3, 2026 · Tejaswi Suresh · Founder

DocLD-TableBench: How We Stack Up Against the Best in Table Extraction

We ran DocLD against Reducto's open RD-TableBench dataset — 1,000 PhD-annotated complex tables — and compared accuracy with Reducto, Azure, Textract, GPT-4o, and more. Here's what we found.

Feb 27, 2026 · Tejaswi Suresh · Founder

Extract Like a Pro — How DocLD Handles Your Messiest Documents

DocLD's intelligent extraction doesn't just read documents; it understands them. See how we turn complex layouts, tables, and multi-page forms into clean, structured data.

Feb 27, 2025 · Tejaswi Suresh · Founder

Structured Extraction in DocLD — Schemas, Jobs, and Corrections

How DocLD turns documents into structured data: defining schemas, running extraction jobs, and fixing results with the correction UI and API.

Feb 19, 2025 · Tejaswi Suresh · Founder

Everything You Need to Know About PDFs

A technical deep dive into the Portable Document Format: specification, file structure, object model, text vs. images, fonts and encoding, parsing strategies, security, linearization, accessibility, and the tooling ecosystem.

Feb 16, 2025 · Tejaswi Suresh · Founder

Building DocLD — How We're Building It, Why We're Here, and Where We Stand

Our approach to building DocLD as an end-to-end document intelligence platform, why we're solving this problem, and how we're positioning against the competition.

Feb 2, 2025 · Tejaswi Suresh · Founder

How RAG Works in DocLD — Retrieval, Reranking, and Citations Under the Hood

A technical deep dive into DocLD's RAG pipeline: how Pinecone's integrated embeddings power search, how retrieved chunks are reranked, and how cited answers are generated from your knowledge bases.

Feb 2, 2025 · Tejaswi Suresh · Founder