extractor.arkintel.com

Files in. Structured data out.

Send any file and a JSON schema. We map the file into the shape you asked for — even if the file and the schema look nothing alike. No chunking, no prompt engineering, no post-processing.

Try the playground · How it works

// we handle OCR, vision and metadata fusion, and schema validation. you write the schema. you send the files.

what you can send

documents

pdf
docx
doc
pptx
xlsx
csv
html
txt
md

images

jpg
png
heic
webp
tiff
bmp
gif

mail

// no audio, no video — yet
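
A pre-flight check on the client can be sketched from the lists above. This is a minimal sketch, not part of the service: the function name `file_group` is hypothetical, the extension sets are copied from this page only, and the "mail" group is left out because no extensions are listed for it. The service may accept aliases (for example `jpeg`) that this check would reject.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the lists on this page. The "mail" group names no
# extensions here, so it is deliberately not guessed at.
DOCUMENTS = {"pdf", "docx", "doc", "pptx", "xlsx", "csv", "html", "txt", "md"}
IMAGES = {"jpg", "png", "heic", "webp", "tiff", "bmp", "gif"}


def file_group(path: str) -> Optional[str]:
    """Return "documents" or "images" for a listed extension, else None."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext in DOCUMENTS:
        return "documents"
    if ext in IMAGES:
        return "images"
    return None


for name in ("atlas_invoice.pdf", "scan.HEIC", "clip.mp4"):
    print(name, "->", file_group(name))
```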

01 playground

Try it now.

Pick a preset schema, drop in a sample file, and inspect the JSON we hand back. The playground uses the same extraction backend as production.

// presets for now. with API access you'd send your own schema in the request body.

sandboxed · no signup · nothing stored

02 the engine

Six lenses on the file. One JSON in your shape.

Most extractors pick a horse (pure OCR, pure vision, or one giant LLM call) and lose what the others would have caught. We run six lenses against the same file in a single round trip, weigh them against each other, and only commit a value once they agree. The answer comes back in your schema, with your field names.

// no single method gets it right. we run six and let them argue.
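
One way to picture "weigh them against each other, and only commit a value once they agree" is a weighted vote per field. The sketch below is a toy illustration under stated assumptions, not the engine's actual method: the weights, the `min_share` threshold, and the name `fuse_field` are all hypothetical, since the page does not publish how lenses are weighed.

```python
from collections import defaultdict
from typing import Optional

# Hypothetical weights for the five lenses that feed the consensus step.
LENS_WEIGHTS = {
    "metadata": 1.0,
    "ocr": 1.0,
    "layout": 1.0,
    "vision": 1.0,
    "reasoning": 1.5,
}


def fuse_field(candidates: dict[str, str], min_share: float = 0.6) -> Optional[str]:
    """Return the value backed by at least `min_share` of the total weight.

    `candidates` maps a lens name to the value that lens read for one field.
    Returns None when no value reaches the threshold (no agreement).
    """
    scores: dict[str, float] = defaultdict(float)
    for lens, value in candidates.items():
        scores[value.strip()] += LENS_WEIGHTS.get(lens, 1.0)
    if not scores:
        return None
    best_value, best_score = max(scores.items(), key=lambda kv: kv[1])
    total = sum(scores.values())
    return best_value if best_score / total >= min_share else None


readings = {
    "ocr": "INV-2026-0418",
    "layout": "INV-2026-0418",
    "reasoning": "INV-2026-0418",
    "vision": "INV-2026-0413",  # one lens misreads the last digit
}
print(fuse_field(readings))
```

Here three lenses outvote the single misread, so the field is committed. A 50/50 split would return `None` rather than guess.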

// the three colours in the stack

traditional
ai
fusion

POST /v1/extract

multipart/form-data

// your schema

vendor      string
invoice_no  string
issued_at   date
due_at      date
total_eur   number

// your file

atlas_invoice.pdf

scanned · OCR

// six lenses on the same file

metadata · traditional · embedded text, EXIF, dates
ocr · traditional · printed + handwritten characters
layout · traditional · columns, table cells, key/value
vision · ai · logos, signatures, charts
reasoning · ai · schema-aware llm pass
consensus · fusion · cross-checks every field

// merged answer — your shape

200 OK

{
  "vendor":     "Atlas Logistics GmbH",
  "invoice_no": "INV-2026-0418",
  "issued_at":  "2026-04-12",
  "due_at":     "2026-05-12",
  "total_eur":  4280.50
}
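
The diagram lists field names and types but not the schema file itself. The sketch below shows what that schema might look like, assuming the service accepts standard JSON Schema (the page does not say which dialect it expects). Because JSON Schema has no `date` type, `date` is written here as a string with `format: date`, which is also an assumption. `check_invoice` is a hypothetical helper that re-checks the sample response locally with the standard library only.

```python
import json
from datetime import date

# Assumption: standard JSON Schema is accepted; the page only lists names and types.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "invoice_no": {"type": "string"},
        "issued_at": {"type": "string", "format": "date"},
        "due_at": {"type": "string", "format": "date"},
        "total_eur": {"type": "number"},
    },
    "required": ["vendor", "invoice_no", "issued_at", "due_at", "total_eur"],
}


def check_invoice(payload: dict) -> list[str]:
    """Minimal local check against INVOICE_SCHEMA; returns a list of problems."""
    problems = []
    for name, spec in INVOICE_SCHEMA["properties"].items():
        if name not in payload:
            problems.append(f"missing field: {name}")
            continue
        value = payload[name]
        if spec["type"] == "number":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name}: expected number")
        elif not isinstance(value, str):
            problems.append(f"{name}: expected string")
        elif spec.get("format") == "date":
            try:
                date.fromisoformat(value)
            except ValueError:
                problems.append(f"{name}: not an ISO date")
    return problems


# The sample response shown on this page.
response_body = """{"vendor": "Atlas Logistics GmbH", "invoice_no": "INV-2026-0418",
 "issued_at": "2026-04-12", "due_at": "2026-05-12", "total_eur": 4280.50}"""

schema_text = json.dumps(INVOICE_SCHEMA, indent=2)  # could be saved as invoice.schema.json
print(check_invoice(json.loads(response_body)))
```

The service already validates against the schema before responding, so a client-side check like this is optional. It is mainly useful in tests.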

03 security

Built for the files you can’t afford to leak.

Extract runs as a managed EU cloud service with zero retention. Files and responses are deleted the moment we hand back your JSON, traffic is encrypted in transit (TLS) and while processing, and your data never trains a model.

// same engine also runs inside your own network.

arkintel cloud

Managed EU cloud

Hit /v1/extract. We handle everything else.

zero retention — files and responses deleted after we hand back your JSON
encrypted in transit (TLS) and while processing
your data is never used to train models
EU-hosted API and storage
LLM steps may use vetted third-party hosted model providers
per-tenant audit log on request

Try the playground

Self-hosted

Same engine on your own servers — same /v1/extract endpoint, open-weight models, nothing leaves your network.

How self-hosting works

Want the longer argument? Read the sovereignty brief

04 the wire

One endpoint. Boring on purpose.

Multipart POST. The schema is a JSON form field, the files are file fields, the response is your data — validated, in your shape, in the same round trip. No clever protocol to learn.

// what you don’t have to do

install an SDK
wire up webhooks
speak a streaming protocol
presign upload URLs
track temporary file IDs
poll, batch, or coordinate jobs

// the whole integration fits on a postcard.

extract.sh

curl -X POST https://api.arkintel.com/v1/extract \
  -H "Authorization: Bearer $ARKINTEL_API_KEY" \
  -F 'schema=@invoice.schema.json;type=application/json' \
  -F "files=@invoice.pdf"
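
For callers who would rather not shell out to curl, the same request can be sketched with the Python standard library alone. The field names `schema` and `files`, the endpoint, and the bearer header come from the curl command above. The function name `build_extract_request` and the `base_url` parameter are hypothetical, and the multipart body is assembled by hand here, so treat it as an illustration rather than a reference client.

```python
import json
import mimetypes
import urllib.request
import uuid


def build_extract_request(
    schema: dict,
    filename: str,
    file_bytes: bytes,
    api_key: str,
    base_url: str = "https://api.arkintel.com",
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to /v1/extract."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # Schema part: sent as a JSON file field, mirroring the curl -F flag above.
    schema_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="schema"; filename="invoice.schema.json"\r\n'
        "Content-Type: application/json\r\n\r\n"
        f"{json.dumps(schema)}\r\n"
    ).encode("utf-8")
    # File part: header lines, then the raw bytes, then the closing boundary.
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = schema_part + file_head + file_bytes + closing
    return urllib.request.Request(
        f"{base_url}/v1/extract",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_extract_request({"type": "object"}, "invoice.pdf", b"%PDF-1.7", api_key="test-key")
print(req.get_method(), req.full_url)
# To actually send it (needs a real key and network access):
# with urllib.request.urlopen(req) as resp:
#     data = json.load(resp)
```

Since the self-hosted deployment is described as exposing the same /v1/extract endpoint, pointing `base_url` at an internal host should be the only change needed there. The exact self-hosted URL is not given on this page.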

05 — ship it

Ship your own schema in production.

The schemas in this playground are examples for testing. With production access you pass your own schema in every request — no whitelisting, no waiting on us. Drop us a line and we’ll get you a key.

Email us · Back to arkintel.com

// typical reply within one business day.

// contact

Email us — humans, not a ticket queue.

contact@arkintel.com