New Workbench v6.7 — pipelines for the agentic era

Turn every document
into trusted, structured data.

Blockdata turns your documents — and the databases around them — into blocks: atomic, cited, queryable units that your agents, your analysts, and your SQL all read the same way.

Start free · Talk to sales

SOC 2 Type II · HIPAA · EU data residency

Assets

Runs

Schemas

Preview

Type | Name | Status | Pages | Conf.
PDF | msa-acme-2026-final.pdf | Indexed | 24 | 0.98
DOC | redline-vendor-agreement.docx | Running | 18 | —
PDF | insurance-claim-7842.pdf | Indexed | 9 | 0.94
IMG | patient-intake-scan-04.tiff | Indexed | 3 | 0.91
PDF | sec-10k-q3-2025.pdf | Queued | 142 | —
MD | deal-memo-blockdata.md | Indexed | 6 | 0.99
TXT | earnings-transcript-q2.txt | Indexed | 11 | 0.96
PDF | compliance-policy-v3.pdf | Indexed | 42 | 0.95

1 selected · Parse · Extract · Classify · Index · ~78 cr

Trusted by document-heavy teams

Lattice Legal

Northwind

Helvetia

Caplan Bank

Orion Health

Meridian

Platform

A workbench, not a black box. Every block inspectable.

Documents and databases land in one stack of blocks — each with its source span, its confidence, and a re-runnable record of every decision the pipeline made. Spin out vectors, graphs, schemas, or agents from the same foundation.
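
As a rough illustration of what a single block could carry, here is a minimal sketch. The `Block` and `SourceSpan` classes, their field names, and the sample values are assumptions made for this example, not Blockdata's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSpan:
    """Where a block came from in the source document (hypothetical shape)."""
    document: str
    page: int
    start_char: int
    end_char: int


@dataclass(frozen=True)
class Block:
    """One atomic, cited unit of content (hypothetical shape)."""
    text: str
    span: SourceSpan
    confidence: float
    decisions: tuple[str, ...] = ()  # record of pipeline decisions, oldest first


# Illustrative values only.
block = Block(
    text="This Agreement is effective as of 2026-03-14.",
    span=SourceSpan("msa-acme-2026-final.pdf", page=1, start_char=120, end_char=165),
    confidence=0.98,
    decisions=("parse:layout", "classify:MSA"),
)

# A citation can be derived from the span alone.
citation = f"{block.span.document}, p. {block.span.page}"
print(citation)
```

Keeping the span and the decision record on the block itself is what would let any downstream output (a vector, a graph edge, a SQL row) point back to its source.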

01 Parse

From any document to the same clean schema.

PDFs, scans, DOCX, slides, spreadsheets, emails. Layout-aware parsing with tables, headings, and figures preserved. Page-accurate provenance on every block.

→ parse

{ "title": "MSA", "effective": "2026-03-14", "parties": [ "Acme Inc.", "Blockdata" ], "pages": 24 }

02 Extract

Structured fields, on your schema.

Define the columns you care about. We fill them, cite the source, and flag low confidence for review.

effective_date 2026-03-14

counterparty Acme Inc.

termination 90 days
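
The "flag low confidence for review" behaviour can be pictured with a small sketch. Everything here is hypothetical: the `ExtractedField` shape, the 0.90 threshold, and the page numbers and confidence values are illustrative assumptions, not documented Blockdata defaults.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.90  # assumed cutoff, not a documented default


@dataclass
class ExtractedField:
    name: str
    value: str
    page: int          # page the value was cited from
    confidence: float


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Return (accepted, flagged) lists, split on the confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    flagged = [f for f in fields if f.confidence < threshold]
    return accepted, flagged


# Illustrative values only.
fields = [
    ExtractedField("effective_date", "2026-03-14", page=1, confidence=0.98),
    ExtractedField("counterparty", "Acme Inc.", page=1, confidence=0.97),
    ExtractedField("termination", "90 days", page=17, confidence=0.82),
]

accepted, flagged = split_for_review(fields)
print([f.name for f in flagged])
```

In this sketch only `termination` falls below the threshold, so it alone would be routed to a reviewer, with its cited page attached.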

03 Classify

Route each document to the right pipeline.

Custom taxonomies. Few-shot or zero-shot. Confidence ranked.

MSA 0.96

SOW

NDA

Insurance

Filing

04 Stack

One stack of blocks. Every shape you need.

A thousand documents, five million blocks, one consolidated stack. Spin out a vector store, a knowledge graph, a Postgres schema, or a Mongo collection — from the same source.

stack · 5.2M blocks

→

VECTOR: pgvector · pinecone

GRAPH: knowledge graph

SQL: postgres schema

DOC: mongodb
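
To make the "spin out a SQL schema from the same source" idea concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for Postgres, and the table layout and sample rows are assumptions for illustration only, not the schema Blockdata generates.

```python
import sqlite3

# Hypothetical blocks: (document, page, text, confidence)
blocks = [
    ("msa-acme-2026-final.pdf", 1, "Effective date: 2026-03-14", 0.98),
    ("msa-acme-2026-final.pdf", 17, "Termination: 90 days", 0.91),
    ("insurance-claim-7842.pdf", 2, "Claim summary", 0.94),
]

# An in-memory SQLite database stands in for a Postgres target.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blocks (document TEXT, page INTEGER, text TEXT, confidence REAL)"
)
conn.executemany("INSERT INTO blocks VALUES (?, ?, ?, ?)", blocks)

# Once the blocks are rows, ordinary SQL applies.
rows = conn.execute(
    "SELECT document, COUNT(*) FROM blocks GROUP BY document ORDER BY document"
).fetchall()
print(rows)
```

The same list of blocks could just as well feed a vector index or a graph loader; the point of the sketch is that every output shape starts from one source.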

05 Agents

Or ship a specialist agent. Powered by Kai.

Hand any slice of the stack to an agent that knows it cold. Orchestrated on Kai, our companion platform for agentic work — same auth, same audit trail.

CA · Contract Analyst · 412 MSAs

CF · Claims Filer · 8.4k policies

RT · Risk Triage · live

Kai agent platform →

Pipeline

One canvas. Six honest steps.

Every job in Blockdata follows the same six-step shape. Stop at any step, inspect outputs, retry with a tweaked prompt or schema, then re-run downstream — without losing what already worked.

Ingest

Folders, buckets, SharePoint, S3, GCS — plus Postgres, Mongo, and warehouse connectors.

+ 1,284 files queued
+ postgres://prod/contracts
+ gcs://contracts/2026

Parse

Layout-aware. Tables, figures, footnotes, signatures.

blocks: 1,284 → 18,712
tables: 412
avg p99: 4.2s

Classify

Route to MSA / NDA / claim / filing pipelines.

MSA: 412 · NDA: 188
claim: 642 · filing: 42
unsure: 12 → review

Extract

Your schema. Cited fields. Confidence on every value.

fields: 24 · cited: 100%
avg conf: 0.94
flagged: 38 → review

Stack

5.2M blocks consolidated. Output to vector, graph, SQL, or Mongo.

→ pgvector · 312k
→ kg edges · 89k
→ postgres · msa.v3

Agents

Ship answers via API, dashboard, or a specialist agent on Kai.

/v1/query · /v1/stack
kai://agents/contract
p50 latency: 380ms
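
The "stop at any step, then re-run downstream" behaviour of the six steps can be sketched as a run that caches each step's output. This is a toy model under stated assumptions: the `PipelineRun` class and its methods are hypothetical and are not the Blockdata SDK.

```python
STEPS = ["ingest", "parse", "classify", "extract", "stack", "agents"]


class PipelineRun:
    """Toy run that caches each step's output (hypothetical, not the SDK)."""

    def __init__(self, step_fns):
        self.step_fns = step_fns  # step name -> function(data) -> data
        self.outputs = {}         # cached output of each completed step
        self.executed = []        # steps executed since last cleared

    def run(self, start="ingest", seed=None):
        """Run `start` and every later step, reusing the cached upstream output."""
        index = STEPS.index(start)
        data = self.outputs[STEPS[index - 1]] if index > 0 else seed
        for step in STEPS[index:]:
            data = self.step_fns[step](data)
            self.outputs[step] = data
            self.executed.append(step)
        return data


# Each toy step just appends its own name to the running list.
step_fns = {name: (lambda data, name=name: data + [name]) for name in STEPS}
run = PipelineRun(step_fns)
run.run(seed=[])

# Swap in a tweaked extract step, then re-run only extract and downstream.
step_fns["extract"] = lambda data: data + ["extract-v2"]
run.executed.clear()
result = run.run(start="extract")
print(run.executed)
print(result)
```

Ingest, parse, and classify are not executed a second time; the re-run starts from the cached classify output, which is the property the pipeline description relies on.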

Solutions

Built for teams where documents are the work.

Pre-built schemas, pipelines, and review surfaces tuned for the four document worlds we hear about every week.

Legal

Contracts that read themselves.

MSAs, NDAs, vendor agreements. Pull effective dates, parties, governing law, termination, renewal — into a clause-level table you can query.

MSA · NDA · redlines

Finance

10-Ks, term sheets, every footnote.

SEC filings, prospectuses, transcripts. Numerical extraction with page citations. Diff documents across quarters.

10-K · memos · term sheets

Healthcare

Records to chart, with provenance.

Patient intake, prior auth, lab reports. HIPAA-aligned pipelines. Map to FHIR, drop into your EHR.

HIPAA · FHIR · EHR

Insurance

Claims, faster — without the misses.

Adjuster notes, scans, policy docs. Triage at intake. Flag the 4% of cases that need a human.

claims · policy · triage

218M

documents parsed by Blockdata customers since launch.

0.96

median extraction confidence across legal, finance, healthcare.

11×

faster than the average in-house parse + extract stack.

380ms

p50 retrieval latency for agentic Q&A at scale.

Eleanor Chen

VP Engineering, Lattice Legal

$2.4M

saved in first-year review hours, across 84 attorneys.

We had four years of document AI that almost worked. Blockdata is the first tool where our partners actually trust the output, because every field is cited and every run is auditable. We migrated 11 pipelines off our in-house stack in a quarter.

Developers

Two SDKs, one truthful API. No surprises.

Python, TypeScript, and a REST surface that mirrors the workbench one-to-one. The same primitives your analysts click through, your engineers ship.

Build on the same primitives your team clicks.

Every workbench action — parse, extract, classify, index, query — maps to a single API call. Runs are addressable. Outputs are versioned. Re-running is a one-liner.

Read the docs · View on GitHub

python

from blockdata import Workbench

wb = Workbench(project="contracts-q3")

# upload, parse, classify, and extract in one run
run = wb.pipeline(
    assets="./msas/*.pdf",
    schema="msa.v3",
    classify=["MSA", "NDA"],
    on_low_confidence="review",
)

for doc in run.results:
    print(doc.fields.counterparty,
          doc.fields.effective_date,
          doc.confidence)

# every output is cited & re-runnable
run.rerun(step="extract", schema="msa.v4")

Security

The boring questions, answered up front.

Your documents stay in your tenant. Your model calls stay on the path you choose. No training on your data, ever.

Compliance: SOC 2 Type II

Compliance: HIPAA

Compliance: ISO 27001

Residency: US · EU · UK

Encryption: AES-256

Deploy: Self-host

Models: BYO keys

Audit: Full run log

Ready when you are

Ship document work you can defend.

Start free on the workbench. Hit our API the same day. Bring your enterprise documents when you're ready.

Start free · Book a demo
