The document intelligence market is booming. Companies like Reducto.ai are gaining traction by helping enterprises extract structured data from PDFs and scanned documents. Their value proposition is compelling: upload a document, define a schema, get clean JSON back.

But if you want product-level control — custom routing, storage, compliance, and tight integration with the rest of your file workflows — it’s often more powerful to build your own Reducto-style experience on top of primitives.

Transloadit is those primitives. In this guide, we’ll show you how to combine our document processing Robots with 🤖 /ai/chat to create a flexible, schema-driven pipeline you can shape into your own document AI product.

If you want a turnkey product, a dedicated document AI vendor can be a great fit. But if you need to blend document extraction with uploads, conversions, storage, and downstream workflows, building on Transloadit gives you more leverage.

What document intelligence really means

At its core, document intelligence involves three key capabilities:

  1. Parse — extract text from documents using OCR, preserving layout and structure.
  2. Split — break multi-page documents into manageable chunks.
  3. Extract — pull structured data that matches a predefined schema.

Let’s build each of these with Transloadit.

Setting up your TypeScript project

First, set up a project using the Transloadit Node SDK:

yarn init -y
yarn add transloadit

Note
The v4 Node SDK requires Node.js 20 or newer. The official Node SDK docs currently cover v3. If you want the TypeScript-first v4 release, install transloadit@^4.0.0 as shown in the v4 announcement.

Create your client:

import { Transloadit } from 'transloadit'

const transloadit = new Transloadit({
  authKey: process.env.TRANSLOADIT_AUTH_KEY!,
  authSecret: process.env.TRANSLOADIT_AUTH_SECRET!,
})

Step 1: Document parsing with OCR (optional)

The 🤖 /document/ocr Robot extracts text from PDFs, including scanned PDFs. It supports multiple providers and can return results with layout coordinates or plain text. If your source isn’t a PDF, convert it first with 🤖 /document/convert.

If you are using a PDF-capable model like Claude Sonnet 4, you can skip OCR and send the PDF directly to 🤖 /ai/chat. OCR is still useful when you need layout coordinates or want to normalize non-PDF files first.

const parseResult = await transloadit.createAssembly({
  params: {
    steps: {
      ocr_extract: {
        robot: '/document/ocr',
        use: ':original',
        provider: 'gcp',
        format: 'json',
        granularity: 'full',
        result: true,
      },
    },
  },
  files: {
    document: './invoice.pdf',
  },
  waitForCompletion: true,
})

const ocrResults = parseResult.results.ocr_extract

The granularity: 'full' option returns bounding box coordinates for each text block, which is useful for understanding layout.
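
The exact JSON layout depends on the provider, so treat the shape below as an assumption rather than a documented contract. A minimal sketch that downloads the OCR result and prints each text block with its coordinates could look like this:

// Assumed result shape; verify against your provider's actual JSON output.
interface OcrBlock {
  text: string
  boundingBox?: { x: number; y: number; width: number; height: number }
}

const ocrFile = ocrResults[0]
const ocrJson = (await fetch(ocrFile.ssl_url).then((response) => response.json())) as {
  blocks?: OcrBlock[]
}

for (const block of ocrJson.blocks ?? []) {
  console.log(block.text, block.boundingBox)
}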

Step 2: Document splitting

For large documents, the 🤖 /document/split Robot lets you extract specific pages:

const splitResult = await transloadit.createAssembly({
  params: {
    steps: {
      first_pages: {
        robot: '/document/split',
        use: ':original',
        pages: ['1-5'],
      },
      remaining_pages: {
        robot: '/document/split',
        use: ':original',
        pages: ['6-'],
      },
    },
  },
  files: {
    document: './large-report.pdf',
  },
  waitForCompletion: true,
})
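
Each split step emits its own PDF output, and downstream steps can consume it by name. As a sketch (the summarize step, its prompt, and the assumption that the model reads the split PDF directly are illustrative, not part of the example above), you could route only the first pages into 🤖 /ai/chat:

const summaryResult = await transloadit.createAssembly({
  params: {
    steps: {
      first_pages: {
        robot: '/document/split',
        use: ':original',
        pages: ['1-5'],
      },
      summarize: {
        robot: '/ai/chat',
        // Reads the split PDF produced by first_pages instead of the full document.
        use: 'first_pages',
        credentials: 'my_ai_credentials',
        model: 'anthropic/claude-4-sonnet-20250514',
        messages: 'Summarize these pages in three sentences.',
        result: true,
      },
    },
  },
  files: {
    document: './large-report.pdf',
  },
  waitForCompletion: true,
})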

Step 3: Schema-driven data extraction with AI

Here’s where the magic happens. The 🤖 /ai/chat Robot can process documents and return structured JSON that matches your schema. This is directly comparable to Reducto’s Extract API. Claude Sonnet 4 supports PDFs, so we’ll use model: 'anthropic/claude-4-sonnet-20250514' below. When you set format: 'json', the output is a JSON file in the Assembly results.

Zod v4 ships a native z.toJSONSchema() helper. The snippets below use a small helper that calls it when available and falls back to zod-to-json-schema for Zod v3 projects.

If your account does not have shared AI credentials configured, create AI template credentials in the Transloadit dashboard (for OpenAI, Anthropic, or Google) and reference them via credentials.

For a quick start, you can omit credentials and set test_credentials: true to use Transloadit-provided test keys. That is convenient for demos, but shared keys can be rate-limited, so production workloads should supply their own credentials.

import { z } from 'zod'
import zodToJsonSchema from 'zod-to-json-schema'

const toJsonSchema = (schema: z.ZodTypeAny) =>
  typeof (z as { toJSONSchema?: (schema: z.ZodTypeAny) => unknown }).toJSONSchema === 'function'
    ? (z as { toJSONSchema: (schema: z.ZodTypeAny) => unknown }).toJSONSchema(schema)
    : zodToJsonSchema(schema)

const invoiceSchema = z.object({
  invoice_number: z.string(),
  vendor_name: z.string(),
  vendor_address: z.string().optional(),
  invoice_date: z.string().optional(),
  due_date: z.string().optional(),
  total_amount: z.number(),
  currency: z.string().optional(),
  line_items: z
    .array(
      z.object({
        description: z.string(),
        quantity: z.number().optional(),
        unit_price: z.number().optional(),
        total: z.number().optional(),
      }),
    )
    .optional(),
  tax_amount: z.number().optional(),
  payment_terms: z.string().optional(),
})

const extractionResult = await transloadit.createAssembly({
  params: {
    steps: {
      extract_data: {
        robot: '/ai/chat',
        use: ':original',
        credentials: 'my_ai_credentials',
        model: 'anthropic/claude-4-sonnet-20250514',
        format: 'json',
        schema: JSON.stringify(toJsonSchema(invoiceSchema)),
        messages: `Extract all invoice data from this document.
Be precise with amounts and dates.
If a field is not present, omit it from the response.`,
        result: true,
      },
    },
  },
  files: {
    invoice: './invoice.pdf',
  },
  waitForCompletion: true,
})

const extractedFile = extractionResult.results.extract_data[0]
const invoiceData = await fetch(extractedFile.ssl_url).then((response) => response.json())
console.log(`Invoice #${invoiceData.invoice_number}: $${invoiceData.total_amount}`)
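
Because the same Zod schema produced the JSON Schema sent to the model, you can reuse it to validate the response at runtime. safeParse returns a typed object on success and a readable error otherwise, instead of letting malformed output flow downstream:

const parsed = invoiceSchema.safeParse(invoiceData)
if (!parsed.success) {
  throw new Error(`Extraction did not match the schema: ${parsed.error.message}`)
}

// parsed.data is typed as z.infer<typeof invoiceSchema> from here on.
console.log(`${parsed.data.line_items?.length ?? 0} line items extracted`)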

Build a complete pipeline

Now let’s combine everything into a production-ready pipeline that:

  • optionally extracts text with OCR,
  • splits large files,
  • extracts structured data with AI, and
  • stores results in S3.

import { z } from 'zod'
import zodToJsonSchema from 'zod-to-json-schema'
import { Transloadit } from 'transloadit'

const toJsonSchema = (schema: z.ZodTypeAny) =>
  typeof (z as { toJSONSchema?: (schema: z.ZodTypeAny) => unknown }).toJSONSchema === 'function'
    ? (z as { toJSONSchema: (schema: z.ZodTypeAny) => unknown }).toJSONSchema(schema)
    : zodToJsonSchema(schema)

const financialDocumentSchema = z.object({
  document_type: z.enum(['invoice', 'receipt', 'statement', 'contract']),
  document_date: z.string(),
  parties: z.array(
    z.object({
      name: z.string(),
      role: z.enum(['vendor', 'customer', 'signatory']),
      address: z.string().optional(),
    }),
  ),
  amounts: z.array(
    z.object({
      description: z.string(),
      value: z.number(),
      currency: z.string(),
    }),
  ),
  key_terms: z.array(z.string()).optional(),
  summary: z.string(),
})

type FinancialDocument = z.infer<typeof financialDocumentSchema>

const transloadit = new Transloadit({
  authKey: process.env.TRANSLOADIT_AUTH_KEY!,
  authSecret: process.env.TRANSLOADIT_AUTH_SECRET!,
})

async function processFinancialDocument(filePath: string) {
  const result = await transloadit.createAssembly({
    params: {
      steps: {
        pdf_verified: {
          robot: '/file/filter',
          use: ':original',
          accepts: [['${file.mime}', '==', 'application/pdf']],
        },
        non_pdf: {
          robot: '/file/filter',
          use: ':original',
          accepts: [['${file.mime}', '!=', 'application/pdf']],
        },
        pdf_converted: {
          robot: '/document/convert',
          use: 'non_pdf',
          format: 'pdf',
        },
        extract_structured: {
          robot: '/ai/chat',
          use: ['pdf_verified', 'pdf_converted'],
          credentials: 'ai_credentials',
          model: 'anthropic/claude-4-sonnet-20250514',
          format: 'json',
          schema: JSON.stringify(toJsonSchema(financialDocumentSchema)),
          messages: 'Extract the structured data from this document.',
          result: true,
        },
        store_results: {
          robot: '/s3/store',
          use: [':original', 'extract_structured'],
          credentials: 's3_credentials',
          path: 'documents/${file.name}/',
        },
      },
    },
    files: {
      document: filePath,
    },
    waitForCompletion: true,
  })

  if (result.ok !== 'ASSEMBLY_COMPLETED') {
    throw new Error(`Assembly failed: ${result.error}`)
  }

  const extractedFile = result.results.extract_structured[0]
  return fetch(extractedFile.ssl_url).then((response) => response.json())
}

const data = await processFinancialDocument('./contract.pdf')
console.log(`Processed ${data.document_type}: ${data.summary}`)

The /document/convert → PDF path supports the following input types:

  • Word: .doc, .docx
  • PowerPoint: .ppt, .pptx, .pps, .ppz, .pot
  • Excel: .xls, .xlsx, .xla
  • OpenDocument: .odt, .ott, .odd, .oda
  • Web/markup: .html, .xhtml, .xml, Markdown (.md)
  • Text & rich text: .txt, .csv, .rtf, .rtx, .tex/LaTeX
  • Images: .jpg, .jpeg, .png, .gif, .svg
  • Vector/print: .ai, .eps, .ps

If you need OCR output for layout-aware workflows, insert a /document/ocr step (named ocr_text here), point any step that should read the OCR text at 'ocr_text' instead of ':original', and include it in your storage targets, as in the sketch below.
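
A minimal sketch of those changes, spliced into the steps object of processFinancialDocument (step names and parameter choices are illustrative):

// Added step: layout-aware OCR over both PDF branches.
ocr_text: {
  robot: '/document/ocr',
  use: ['pdf_verified', 'pdf_converted'],
  provider: 'gcp',
  format: 'json',
  granularity: 'full',
  result: true,
},
// Updated step: also persist the OCR output next to the extraction result.
store_results: {
  robot: '/s3/store',
  use: [':original', 'extract_structured', 'ocr_text'],
  credentials: 's3_credentials',
  path: 'documents/${file.name}/',
},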

Process multiple document types

With the 🤖 /file/filter Robot, you can route PDFs directly to the model while converting everything else to PDF first. Using != makes the non‑PDF branch explicit:

// Reuse invoiceSchema + toJsonSchema from above.
const multiTypeResult = await transloadit.createAssembly({
  params: {
    steps: {
      pdf_verified: {
        robot: '/file/filter',
        use: ':original',
        accepts: [['${file.mime}', '==', 'application/pdf']],
      },
      non_pdf: {
        robot: '/file/filter',
        use: ':original',
        accepts: [['${file.mime}', '!=', 'application/pdf']],
      },
      pdf_converted: {
        robot: '/document/convert',
        use: 'non_pdf',
        format: 'pdf',
      },
      extract_data: {
        robot: '/ai/chat',
        use: ['pdf_verified', 'pdf_converted'],
        credentials: 'ai_credentials',
        model: 'anthropic/claude-4-sonnet-20250514',
        format: 'json',
        schema: JSON.stringify(toJsonSchema(invoiceSchema)),
        messages: 'Extract data from this document.',
        result: true,
      },
    },
  },
  files: {
    doc1: './receipt.pdf',
    doc2: './receipt-photo.jpg',
  },
  waitForCompletion: true,
})

Flow overview:

:original
  ├─ pdf_verified (file/filter: mime == pdf) ──▶ /ai/chat
  └─ non_pdf (file/filter: mime != pdf) ──▶ /document/convert (pdf) ──▶ /ai/chat

Use Templates for reusability

For production use, save your Assembly Instructions as a Template and reference it by ID:

const result = await transloadit.createAssembly({
  params: {
    template_id: 'your-invoice-extraction-template',
    fields: {
      custom_prompt: 'Focus on extracting payment terms and due dates.',
    },
  },
  files: {
    document: './invoice.pdf',
  },
  waitForCompletion: true,
})
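
Inside the Template, values passed via fields are available as Assembly Variables, so the stored instructions can interpolate them into the prompt. A sketch of the relevant step in such a Template, shown as a TypeScript object for readability (the step name and surrounding instructions are assumptions):

// Stored in the Template, not in your application code.
const templateSteps = {
  extract_data: {
    robot: '/ai/chat',
    use: ':original',
    credentials: 'ai_credentials',
    model: 'anthropic/claude-4-sonnet-20250514',
    format: 'json',
    // '${fields.custom_prompt}' is replaced with the value passed at Assembly creation time.
    messages: 'Extract all invoice data. ${fields.custom_prompt}',
    result: true,
  },
}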

Handle large document batches

For processing many documents efficiently, keep concurrency limited:

import pMap from 'p-map'

async function processBatch(files: string[]): Promise<Map<string, FinancialDocument>> {
  const concurrency = 5

  const batchResults = await pMap(
    files,
    async (file) => {
      const result = await transloadit.createAssembly({
        params: {
          template_id: 'document-extraction-template',
        },
        files: { document: file },
        waitForCompletion: true,
      })

      // Assumes the template includes a step named `extract_structured`.
      const extractedFile = result.results.extract_structured[0]
      const data = await fetch(extractedFile.ssl_url).then((response) => response.json())

      return {
        file,
        data,
      }
    },
    { concurrency },
  )

  return new Map(batchResults.map(({ file, data }) => [file, data]))
}
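
Calling it is then a matter of collecting file paths and iterating over the returned Map. The paths below are placeholders; note that pMap rejects on the first failed Assembly, so wrap the mapper body in a try/catch if you want partial results:

const results = await processBatch(['./docs/invoice-001.pdf', './docs/invoice-002.pdf'])

for (const [file, document] of results) {
  console.log(`${file}: ${document.document_type} (${document.summary})`)
}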

Why teams choose Transloadit for document AI

  • One API covers ingest, conversion, OCR, AI extraction, and delivery.
  • Import/export integrations for cloud storage (S3, Azure, Google Cloud Storage, Dropbox, and more).
  • Assembly Instructions let you version, reuse, and branch pipelines without glue code.
  • A single vendor for documents and broader file workloads (previews, thumbnails, virus scanning, image/audio/video processing).

Transloadit vs. Dedicated document AI platforms

Think of Reducto as a focused product and Transloadit as a composable platform. You can replicate the core extraction flow and then extend it with everything around it.

| Feature | Transloadit | Reducto.ai |
| --- | --- | --- |
| OCR / text extraction | 🤖 /document/ocr (PDFs) | ✅ Parse API |
| Document splitting | 🤖 /document/split | ✅ Split API |
| Schema-based extraction | 🤖 /ai/chat with JSON schema | ✅ Extract API |
| Storage integrations | ✅ Import/export to cloud storage | External tooling |
| Workflow orchestration | ✅ Assembly Instructions + Templates | External tooling |
| AI providers | ✅ OpenAI, Anthropic, Google (with credentials) | |
| Broader file workloads | ✅ Image/video/audio + previews + security | Document-focused stack |

When to use this approach

This Transloadit-based approach is a strong fit when:

  • you want to build your own document AI product or internal platform,
  • you want one vendor for ingest, conversion, extraction, and delivery,
  • you need to combine document intelligence with broader media workflows, and
  • you want flexibility in choosing AI providers and predictable pricing.

Ready to build?

Document intelligence doesn’t have to mean adding another specialized vendor. With 🤖 /document/ocr, 🤖 /document/split, and 🤖 /ai/chat, you can build sophisticated extraction pipelines that rival dedicated platforms.

Start by exploring our AI Robot docs and see how far you can take your document workflows inside one platform.