Product Spotlight: Workflows + Human-in-the-Loop

Kushal Byatnal

Co-founder, CEO

May 22, 2025

At Extend, we built Workflows to give teams a fast, reliable way to manage end-to-end document ingestion pipelines — from raw PDFs to structured, validated, production-ready data.

Whether you're classifying thousands of vendor forms, extracting fields from multi-page contracts, or applying custom validation rules, Workflows let you orchestrate everything through a single API call — with built-in logic to catch edge cases, escalate issues, and improve accuracy over time.

Production-Ready Pipelines

Configuring an extractor is just step one. To safely ship into production, you need a control plane to handle all the messy parts of real-world documents:

Chain together multiple processing modes to handle real-world document variation
Validate outputs against your own business rules or systems
Catch low-confidence results before they hit downstream systems
Escalate failures for human review — and feed those corrections back into your models

A typical Workflow can include:

Classification and Splitting: Route documents to the right processor based on type, or split documents semantically to isolate relevant sections.
Extraction: Pull structured data using schema-driven extractors — with support for nested fields, arrays, and multimodal elements like signatures, handwriting, and figures.
Validation: Apply custom rules to enforce business logic or cross-check values against your own systems (e.g., “does this vendor ID exist in our database?”).
Human Review: Automatically flag low-confidence fields or failed validations, and route them to human operators for verification or correction.

All of this is coordinated behind a single runWorkflow API call. Instead of wiring up multiple endpoints, you submit a file and get back a complete, reviewed output.
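
To make the stage order concrete, here is a minimal Python sketch of how classification, extraction, validation, and review flagging might chain behind one entry point. It runs locally on toy text and is not Extend's SDK or the runWorkflow API: `run_pipeline`, `PipelineResult`, the keyword classifier, the "key: value" extractor, and the 0.9 threshold are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineResult:
    doc_type: str
    fields: dict
    confidences: dict
    errors: list = field(default_factory=list)
    needs_review: bool = False


def classify(text):
    # Toy classifier (assumption): a real system would use a model.
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "contract" in lowered:
        return "contract"
    return "unknown"


def extract(text):
    # Toy extractor (assumption): reads "key: value" lines and assigns
    # a high confidence to non-empty values and a low one to empty values.
    fields, confidences = {}, {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            key, value = key.strip(), value.strip()
            fields[key] = value
            confidences[key] = 0.99 if value else 0.40
    return fields, confidences


def validate(doc_type, fields):
    # One hypothetical business rule: invoices must carry a vendor_id.
    errors = []
    if doc_type == "invoice" and not fields.get("vendor_id"):
        errors.append("vendor_id is missing")
    return errors


def run_pipeline(text, min_confidence=0.9):
    # Single entry point: classify, extract, validate, then decide on review.
    doc_type = classify(text)
    fields, confidences = extract(text)
    errors = validate(doc_type, fields)
    low_confidence = [k for k, c in confidences.items() if c < min_confidence]
    needs_review = doc_type == "unknown" or bool(errors) or bool(low_confidence)
    return PipelineResult(doc_type, fields, confidences, errors, needs_review)


clean = run_pipeline("Invoice\nvendor_id: V-100\ntotal: 250.00")
broken = run_pipeline("Invoice\nvendor_id:\ntotal: 250.00")
```

In this sketch the caller only ever touches `run_pipeline`; each stage can change independently as long as it keeps passing the same data forward.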

Mapping a Complex Flow

Let’s say you're a business lending platform. When you onboard a new customer, they submit a single PDF with dozens of pages: purchase contracts, tax forms, bank statements.

To extract the data you need and ensure it’s correct, you need to:

Split the PDF into sub-documents (contracts, forms, certificates)
Route each sub-document to the right Extractor for schema-specific parsing
Validate critical fields (e.g. whether the totals match the line items)
Escalate documents that fail validation to human review
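
As a rough illustration of the validate and escalate steps above, the sketch below checks whether each sub-document's line items add up to its stated total and sorts the sub-documents into accepted and escalated lists. The data shapes, the function names `totals_match` and `route_sub_documents`, and the one-cent tolerance are assumptions made for this example, not part of Extend's product.

```python
from decimal import Decimal


def totals_match(line_items, stated_total, tolerance=Decimal("0.01")):
    """Return True if the line items sum to the stated total within tolerance."""
    computed = sum((Decimal(item["amount"]) for item in line_items), Decimal("0"))
    return abs(computed - Decimal(stated_total)) <= tolerance


def route_sub_documents(sub_documents):
    """Sort sub-document names into (accepted, escalated) based on validation."""
    accepted, escalated = [], []
    for doc in sub_documents:
        if totals_match(doc["line_items"], doc["total"]):
            accepted.append(doc["name"])
        else:
            escalated.append(doc["name"])
    return accepted, escalated


# Hypothetical output of the split and extraction steps.
sub_documents = [
    {
        "name": "purchase_contract",
        "line_items": [{"amount": "1200.00"}, {"amount": "300.50"}],
        "total": "1500.50",
    },
    {
        "name": "bank_statement",
        "line_items": [{"amount": "99.99"}, {"amount": "0.02"}],
        "total": "100.10",
    },
]

accepted, escalated = route_sub_documents(sub_documents)
```

Amounts are parsed with `Decimal` rather than `float` so that currency sums are compared exactly instead of picking up binary rounding error.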

This kind of logic used to take teams weeks or months to build and maintain. With Workflows, it’s supported natively and comes with version control, monitoring, and observability on day one.

Human-in-the-Loop Review

No matter how advanced the model, 100% accuracy isn’t guaranteed. Fuzzy visuals, ambiguous data, and model errors can lead to incorrect outputs — and serious downstream consequences.

Extend includes built-in human-in-the-loop (HITL) tooling to catch and correct these issues.

You can configure review triggers at any step:

Confidence thresholds (e.g. flag total_amount if its confidence is below 0.95)
Validation failures (e.g. line item totals don’t add up)
External system checks (e.g. customer ID not found in your database)
Unexpected doc types (e.g. customer uploads an invalid document type)
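
The four trigger types above could be combined along these lines. This is a sketch under stated assumptions: `ReviewPolicy`, `review_reasons`, the shape of the `run` dictionary, and the list of allowed document types are hypothetical and do not reflect how Extend represents its configuration.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewPolicy:
    # Per-field minimum confidence; fields not listed are not checked.
    min_confidence: dict = field(default_factory=lambda: {"total_amount": 0.95})
    # Document types the workflow expects to receive (assumed values).
    allowed_doc_types: frozenset = frozenset({"invoice", "contract", "bank_statement"})


def review_reasons(run, policy, known_customer_ids):
    """Return a list of reasons to send this run to review (empty if none)."""
    reasons = []

    # 1. Confidence thresholds
    for name, threshold in policy.min_confidence.items():
        confidence = run["confidences"].get(name, 0.0)
        if confidence < threshold:
            reasons.append(f"low confidence: {name}")

    # 2. Validation failures reported by earlier steps
    reasons.extend(f"validation failed: {msg}" for msg in run["validation_errors"])

    # 3. External system check, simulated here with an in-memory set
    if run["fields"].get("customer_id") not in known_customer_ids:
        reasons.append("customer_id not found")

    # 4. Unexpected document types
    if run["doc_type"] not in policy.allowed_doc_types:
        reasons.append(f"unexpected doc type: {run['doc_type']}")

    return reasons


policy = ReviewPolicy()
known_ids = {"C-1", "C-2"}

run = {
    "doc_type": "invoice",
    "fields": {"customer_id": "C-1", "total_amount": "100.10"},
    "confidences": {"total_amount": 0.91},
    "validation_errors": [],
}
reasons = review_reasons(run, policy, known_ids)
```

Returning the reasons, rather than a bare yes/no flag, lets a reviewer see why a document was routed to them.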

Flagged documents are routed to Extend’s built-in Review UI, where your team members can:

Edit any extracted value
Reclassify documents
Approve or reject the run
Feed corrections into evaluation sets

It’s more than just a safety net — it’s a tight feedback loop that improves your models over time.
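
One way to picture that feedback loop: reviewed values become ground truth in an evaluation set, and later model or config versions are scored against it. The sketch below assumes exact-match scoring and in-memory dictionaries; `record_corrections` and `field_accuracy` are hypothetical helpers, not Extend functions.

```python
def record_corrections(eval_set, doc_id, extracted, reviewed):
    """Store reviewed values as ground truth; return names of corrected fields."""
    eval_set[doc_id] = dict(reviewed)
    return sorted(k for k in reviewed if extracted.get(k) != reviewed[k])


def field_accuracy(eval_set, predictions):
    """Fraction of ground-truth fields that the predictions reproduce exactly."""
    total = correct = 0
    for doc_id, truth in eval_set.items():
        predicted = predictions.get(doc_id, {})
        for name, value in truth.items():
            total += 1
            correct += predicted.get(name) == value
    return correct / total if total else 0.0


eval_set = {}

# A reviewer fixes one of two extracted fields.
first_output = {"total": "100.10", "vendor_id": "V-1"}
reviewed = {"total": "100.01", "vendor_id": "V-1"}
corrected = record_corrections(eval_set, "doc-1", first_output, reviewed)

# Score the original output and a later, improved output on the same set.
before = field_accuracy(eval_set, {"doc-1": first_output})
after = field_accuracy(eval_set, {"doc-1": {"total": "100.01", "vendor_id": "V-1"}})
```

Tracking a number like this over time gives a basis for deciding how much review is still needed.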

From Human Review → Full Automation

When launching a mission-critical use case, teams enable human review to catch early issues and accelerate iteration. Over time, as models and configs improve, review needs drop dramatically.

HomeLight, for example, initially reviewed nearly every document. After a month of near-perfect accuracy and no corrections, they removed human-in-the-loop review entirely.

Looking Ahead

We’re continuing to invest in making Workflows even smarter:

Memory System: Leverage past corrections and layouts to improve model behavior over time.
Review Agent: Use an AI reviewer to automatically decide whether an output is validated or should be flagged for a second set of eyes.

If You’re Building Document Automation

If you want to get to high accuracy without building your own validation and review pipeline, let’s talk.
