The first version of Extend's API was built in a weekend. It wasn't perfect, but it got the job done. As new feature requests came in, we added functionality on top of it iteratively and released new API versions every time we needed to make breaking changes. Meanwhile, our user base shifted. Early on, many of our users were operations teams configuring things through our dashboard. As we leaned into developers as our primary audience, the bar got much higher: they expected a predictable, well-typed API they could integrate with in minutes, and we realized we needed to make some big changes to hit that bar.
Here are the problems we faced and how we fixed them.
Problem: Polymorphic endpoints weren't type-safe
The /processor_runs and /processors endpoints were among our most important: they were the primary way customers extracted, classified, or split documents without using workflows.
Here's what calling the /processor_runs endpoint on our old API looked like:
```typescript
const run = await client.processorRun.create({
  processorId: "dp_abc123",
  file: { fileUrl: "https://..." },
  sync: true,
});

if (run.processorRun.type === "EXTRACT") {
  const output = run.processorRun.output as ExtractOutput; // <-- type casting
}
```

That `as ExtractOutput` cast is the problem. The endpoint returns a `document_processor_run` object with two separate fields: `type`, which could be `EXTRACT`, `CLASSIFY`, or `SPLIT`, and `output`, which could be `ExtractOutput`, `ClassifyOutput`, or `SplitOutput`. But the schema defined these as independent fields with no relationship between them, so TypeScript had no way to know that `type: "EXTRACT"` meant `output` was an `ExtractOutput`. You'd check the type and still have to cast the output manually. TypeScript does narrow discriminated unions when they're structured correctly, but ours weren't, and most other languages have no equivalent mechanism at all.
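For contrast, here's what a correctly discriminated union looks like, where the schema ties `type` to its matching `output` so the compiler narrows automatically. The type and field names below are illustrative, not our actual schema:

```typescript
// Hypothetical run shapes; each variant pins `type` to its matching output type.
type ExtractOutput = { value: Record<string, unknown> };
type ClassifyOutput = { label: string };

type ProcessorRun =
  | { type: "EXTRACT"; output: ExtractOutput }
  | { type: "CLASSIFY"; output: ClassifyOutput };

function handle(run: ProcessorRun): string {
  if (run.type === "EXTRACT") {
    // Narrowed: `run.output` is ExtractOutput here, no cast needed.
    return JSON.stringify(run.output.value);
  }
  // The compiler knows the only remaining variant is CLASSIFY.
  return run.output.label;
}
```

With this structure the cast disappears, but only in languages whose type systems support it, which is part of why we moved away from the polymorphic endpoint entirely.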
This smell was a side effect of an early design choice: one generic Processor endpoint for everything. This was convenient for our internal data models and allowed us to move fast, but it was a leaky abstraction that created a confusing and suboptimal developer experience.
In addition, having these endpoints required customers to learn the "processors" concept when really all they wanted to do was extract, classify, or split. This was confusing, especially for new users already waterboarded by context when starting a new tool.
Fix: Dedicated endpoints per operation
The "processors" abstraction was an internal concept that had leaked into our public API. Customers didn't think in terms of processors. They wanted to extract data from a document, classify a document, or split a document. The endpoint names should have reflected that from the start.
We split /processors and /processor_runs into dedicated resource families:
| Resource | Old Endpoints | New Endpoints |
|---|---|---|
| Extraction | /processors + /processor_runs | /extractors + /extract_runs + /extract |
| Classification | /processors + /processor_runs | /classifiers + /classify_runs + /classify |
| Splitting | /processors + /processor_runs | /splitters + /split_runs + /split |
Each resource family follows the same pattern. POST /extract_runs creates an extract run, GET /extract_runs/{id} fetches one, GET /extract_runs lists them. A developer who knows the resource name can guess every endpoint. The naming is now self-documenting: you don't need to learn what a "processor" is before you can use the API.
And because each endpoint maps to a single resource type, responses are fully typed. No more casting:
```typescript
const run = await client.extractRuns.create({
  extractor: { id: "ex_abc123" },
  file: { url: "https://..." },
});

console.log(run.output?.value); // Typed as ExtractOutput
```

Problem: Sync and async shared the same endpoint
Our /processor_runs endpoint supported both synchronous and asynchronous execution via a sync flag. We originally offered only async: you'd create a run, get back an ID, and poll or set up a webhook to get results:
```typescript
const run = await client.processorRun.create({
  processorId: "dp_abc123",
  file: { fileUrl: "https://..." },
});

// Poll until done
let result = await client.processorRun.get(run.id);
while (result.status === "PROCESSING") {
  await sleep(1000);
  result = await client.processorRun.get(run.id);
}
```

That's the right pattern for production, but it made it impossible to just fire off a curl request and see what the output looked like. Every new customer had to write a polling script before they could even evaluate the product. So we added sync: true to the same endpoint, which blocks until the run completes and returns the result inline.
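A more robust version of that loop is what every customer ended up writing before they could evaluate the product. Here is a minimal, self-contained sketch of such a polling helper; the `getRun` callback stands in for the SDK's get call, and the interval and timeout defaults are assumptions:

```typescript
type RunStatus = "PROCESSING" | "PROCESSED" | "FAILED";
interface Run { id: string; status: RunStatus }

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the run leaves PROCESSING, or give up after `timeoutMs`.
async function pollUntilDone(
  getRun: (id: string) => Promise<Run>,
  id: string,
  intervalMs = 1000,
  timeoutMs = 60_000,
): Promise<Run> {
  const deadline = Date.now() + timeoutMs;
  let run = await getRun(id);
  while (run.status === "PROCESSING") {
    if (Date.now() > deadline) throw new Error(`run ${id} timed out`);
    await sleep(intervalMs);
    run = await getRun(id);
  }
  return run;
}
```

Note the timeout: even a throwaway evaluation script needs one, which is exactly the kind of boilerplate that makes async-only a poor first experience.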
This fixed onboarding but introduced its own problems. The response shape changes depending on whether sync is set, which is confusing and hard to document cleanly. Customers see the flag and assume it's the recommended approach for production use. And because sync was meant for testing, we wanted to enforce much lower rate limits on it; but since sync and async shared an endpoint, the rate limit would have had to depend on the parameters passed, which is confusing and unintuitive for developers.
Fix: Dedicated sync endpoints
Early in the process, we had settled on a set of principles, two of which were: endpoints should reflect the resources they create, and endpoints should be async, never blocking on long-running operations. The async rule exists for good reasons, like connection exhaustion and unpredictable client timeouts.
But we kept getting the same feedback: "I just want to send a document and get a result. Why do I need webhooks to try your product?" New customers evaluating Extend, developers building prototypes, and small teams processing a few documents a day all wanted a simpler path. The async pattern is right for production at scale, but it's a bad first experience.
Rather than bolt another flag onto an existing endpoint like we did before, we added dedicated POST /extract, POST /classify, POST /split endpoints. These are separate sync endpoints that block until completion:
```bash
curl -X POST https://api.extend.ai/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-extend-api-version: 2026-02-09" \
  -H "Content-Type: application/json" \
  -d '{"config": {"schema": {"type": "object", "properties": {"vendor": {"type": ["string", "null"]}}}}, "file": {"url": "https://example.com/invoice.pdf"}}'
```

Because these are their own endpoints, we can rate limit them independently, document them separately, and make it clear in the docs that they're for testing and onboarding, not production. For production, use the async *_runs endpoints with webhooks or the SDK's createAndPoll. But the on-ramp matters. A developer's first 5 minutes with your API determines whether they stick around.
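The same first-run experience works from code with plain fetch, no SDK required. A hedged sketch: the URL, headers, and request body mirror the curl example above, while the request-builder split and everything else are assumptions for illustration:

```typescript
// Build the sync /extract request; splitting this out keeps it testable offline.
function buildExtractRequest(
  apiKey: string,
  fileUrl: string,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: "https://api.extend.ai/extract",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "x-extend-api-version": "2026-02-09",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        config: {
          schema: { type: "object", properties: { vendor: { type: ["string", "null"] } } },
        },
        file: { url: fileUrl },
      }),
    },
  };
}

// One-shot call: send a document, block until the result comes back inline.
async function extractOnce(apiKey: string, fileUrl: string): Promise<unknown> {
  const { url, init } = buildExtractRequest(apiKey, fileUrl);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`extract failed: ${res.status}`);
  return res.json();
}
```

One request, one response, no webhooks: that is the entire on-ramp the sync endpoints were built for.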
Principles are defaults, not hard constraints. When they conflict with what customers need, the customer wins.
Problem: Our API shape was inconsistent and unpredictable
We had no shared philosophy or guidelines for how to design an endpoint. Each engineer made reasonable decisions in isolation, but the API drifted over time. Response shapes varied between endpoints, naming conventions were inconsistent, and customers couldn't predict what a new endpoint would look like based on existing ones.
For example, fields like failureReason, reviewedBy, startTime, and endTime on a workflow run were optional, meaning they'd be absent from the response entirely when they had no value. The shape of the response changed depending on the state of the resource:
```jsonc
// A workflow run that just started — 9 keys in the full response (abbreviated here)
{ "id": "wr_xxx", "status": "PROCESSING", "reviewed": false, "files": [...] }

// The same workflow run after failure — 11 keys
{ "id": "wr_xxx", "status": "FAILED", "reviewed": false, "files": [...],
  "failureReason": "CORRUPT_FILE", "failureMessage": "Unable to parse file" }

// The same workflow run after review — 13 keys
{ "id": "wr_xxx", "status": "PROCESSED", "reviewed": true, "files": [...],
  "reviewedBy": "jane@example.com", "reviewedAt": "2025-03-21T16:45:00Z",
  "startTime": "2025-03-21T15:30:00Z", "endTime": "2025-03-21T15:35:00Z" }
```

A developer couldn't look at one response and know what fields the resource supported. Were fields missing because the run wasn't in the right state? Because of permissions? Because of a deprecated field? There was no way to tell.
Naming was inconsistent too. The full workflow run object returned from GET /workflow_runs/:id called the reviewer field reviewedBy. The summary returned from the list endpoint GET /workflow_runs called the same field reviewedByUser:
```jsonc
// GET /workflow_runs/:id
{ "id": "wr_xxx", "object": "workflow_run", "reviewedBy": "jane@example.com" }

// GET /workflow_runs
{ "id": "wr_xxx", "object": "workflow_run", "reviewedByUser": "jane@example.com" }
```

Same resource, same data, different field name depending on which endpoint you called. A developer who wrote code against the get endpoint would find it silently broken when they switched to the list endpoint.
Internally, this also meant every new public endpoint required rethinking decisions that should have already been made: how to name fields, how to structure responses, whether to nest or flatten. The lack of shared conventions turned every endpoint into a design exercise from scratch, slowing development.
Fix: Rules for consistent response shapes
The inconsistency problem wasn't just about naming conventions. Response payloads varied in structure across endpoints, and the rules for which fields were present changed depending on the state of a resource. We established two rules that brought consistency to every response in the API.
Every resource gets two shapes
Extraction outputs are big: nested JSON with every extracted field, metadata, confidence scores, and citations. A single run's output can be tens of kilobytes. A paginated list of 50 would blow past reasonable HTTP payload sizes and cause real memory pressure on the client during deserialization.
We considered Stripe-style expand[] parameters, but the payload size for a resource can be huge since it depends on the amount of data extracted, so it would be dangerous to let customers expand a parse that could contain megabytes of data. Instead, we decided that every resource should have exactly two shapes: a Full object (from GET /{resource}/{id}) and a Summary (for list endpoints and embedded references). If you need the full output, call the GET endpoint. This pattern makes the API much more predictable for our customers, and makes it much easier for us as a team to decide what data each endpoint should return.
```jsonc
// List response: Summary, no output
{ "object": "extract_run", "id": "exr_xxx", "status": "PROCESSED" }

// Get response: Full, includes output
{ "object": "extract_run", "id": "exr_xxx", "status": "PROCESSED", "output": { ... } }
```

Required but nullable
When an extract run is processing, output has no value yet. You can represent that as optional (absent from the response) or required but nullable (always present, set to null). We chose nullable.
Consistent shape, not conditional shape. Every response includes the same keys regardless of status. A developer reading a single API response can discover every field the resource supports, and a null value is unambiguous: the field exists, you have access to it, it just has no value right now.
```jsonc
{ "status": "PROCESSING", "output": null, "failureReason": null }
{ "status": "PROCESSED", "output": { "value": { ... } }, "failureReason": null }
{ "status": "FAILED", "output": null, "failureReason": "CORRUPT_FILE" }
```

Same keys, every time, regardless of status.
Cross-language SDK consistency. Our SDKs are generated from a single OpenAPI spec. When that spec marks a field as optional, the codegen has to express "this field might not be in the response" which produces different type signatures in every language (T | undefined | null in TypeScript, *T with omitempty in Go, Optional<T> or bare T with setter tracking in Java). Required-but-nullable collapses all of that to one concept: the field is always there, check if it's null.
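In OpenAPI 3.1 terms, the difference is whether a field appears in the schema's required list and admits null in its type, versus being left out of required entirely. A minimal sketch of the nullable approach; the schema and property names are illustrative, not our actual spec:

```jsonc
// OpenAPI 3.1-style schema fragment (illustrative): every field is required,
// and fields that may be empty explicitly allow null.
{
  "ExtractRun": {
    "type": "object",
    "required": ["id", "status", "output", "failureReason"],
    "properties": {
      "id": { "type": "string" },
      "status": { "type": "string", "enum": ["PROCESSING", "PROCESSED", "FAILED"] },
      "output": {
        "oneOf": [{ "$ref": "#/components/schemas/ExtractOutput" }, { "type": "null" }]
      },
      "failureReason": { "type": ["string", "null"] }
    }
  }
}
```

Because nothing is optional, the codegen never has to model "maybe absent", only "maybe null", which is the one concept every target language expresses cleanly.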
For developers using the SDK, the practical difference in application code is small because most languages already treat "missing" and "null" the same way. But it makes the documentation unambiguous. We can write "check if output is null" and that's the complete, correct instruction in every language. No caveats, no "in Go you'll need to..." footnotes.
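Concretely, "check if output is null" really is the whole instruction in consumer code. A TypeScript sketch with illustrative types (not our actual SDK models):

```typescript
// Required-but-nullable: every field is always present, possibly null.
interface ExtractRun {
  status: "PROCESSING" | "PROCESSED" | "FAILED";
  output: { value: Record<string, unknown> } | null;
  failureReason: string | null;
}

// One null check covers every state; no "is the key even there?" branch.
function describe(run: ExtractRun): string {
  if (run.output === null) {
    return run.failureReason ?? "still processing";
  }
  return `extracted ${Object.keys(run.output.value).length} fields`;
}
```

The equivalent code against an optional field would also have to decide what a missing key means, and that decision differs by language and codegen.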
Takeaways
- Internal abstractions shouldn't leak into public interfaces. The "processors" concept made sense in our codebase but added unnecessary cognitive load for customers. Name things after what customers are trying to do, not how your system models it internally.
- Principles are defaults, not constraints. We had a principle that endpoints should never block on long-running operations and it was a good one. But we broke it deliberately to fix the onboarding experience. The customer needs come first.
- Consistency compounds. The individual decisions — nullable over optional, two shapes per resource, predictable naming — are each small. But together they meant developers could predict how a new endpoint would behave before reading the docs.
- Don't optimize past the point where it matters to the customer. You can debate nullable vs. optional forever. At some point, pick one, apply it everywhere, and move on.
As developers, we only get a handful of opportunities to design an API in our careers, if we're lucky. Hopefully this makes it easier the next time you get to do it.

