POST /extract
Extract
curl --request POST \
  --url https://api.scaledown.xyz/extract \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "text": "<string>",
  "entities": {},
  "threshold": 0.5,
  "top_n": 0
}
'
{
  "entities": [
    {
      "text": "<string>",
      "type": "<string>",
      "confidence": 123,
      "start": 123,
      "end": 123,
      "context": "<string>"
    }
  ]
}

Overview

The /extract endpoint runs Named Entity Recognition (NER) over a block of text. Unlike standard NER, you define the entity types you want in plain English — the model uses your descriptions to find matching spans, returning each one with a confidence score and surrounding context. Every result includes up to 500 characters of surrounding text on each side, so you can validate or use the extracted value without going back to the source.

Request

text
string
The input text to extract entities from. Can be a full document, web page content, article, or any plain text string. Either text or document must be provided. If both are given, the OCR text is appended after text.
document
string
A base64-encoded file to extract entities from. Supported formats: JPEG, PNG, TIFF, single-page PDF, multi-page PDF. The file is processed via AWS Textract OCR and the extracted text is used as input. Either text or document must be provided.
document_mime_type
string
MIME type of the document (e.g. "image/jpeg", "application/pdf"). Required when document is provided.
instruction
string
Optional global instruction prepended to the text before extraction. Use this to provide rules that apply across all entity types — for example, deduplication logic, ranking constraints, or output format requirements. This is separate from per-entity descriptions.
entities
object
required
A mapping of entity type names to their definition. Each value can be one of:
  • A plain string — a description of what to look for
  • An object — with optional description, threshold, and top_n fields
  • A nested object — defining a structured sub-schema (object with named fields)
  • An array of objects — defining a repeated structured schema (e.g. a list of line items)
threshold
number
default:0.5
Global confidence threshold (0–1). Entities below this score are filtered out. Can be overridden per entity type.
top_n
number
default:0
Global limit on how many results to return per entity type, ranked by confidence descending. 0 returns all results above the threshold. Can be overridden per entity type.
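The threshold and top_n rules above can be mirrored client-side, for example to re-filter a stored response with stricter settings. The Python sketch below illustrates the documented semantics only and is not the service's implementation. The function name filter_entities and the sample data are hypothetical, and it assumes a score exactly equal to the threshold is kept.

```python
from collections import defaultdict


def filter_entities(entities, threshold=0.5, top_n=0):
    """Drop entities below `threshold`, then keep the `top_n` most confident
    results per entity type (0 keeps everything that passed the threshold)."""
    by_type = defaultdict(list)
    for entity in entities:
        # Assumption: a score equal to the threshold is kept.
        if entity["confidence"] >= threshold:
            by_type[entity["type"]].append(entity)

    result = []
    for items in by_type.values():
        # Rank by confidence descending before applying top_n.
        items.sort(key=lambda e: e["confidence"], reverse=True)
        result.extend(items if top_n == 0 else items[:top_n])
    return result


# Made-up sample data, for illustration only.
sample = [
    {"text": "Ann Lee", "type": "Name", "confidence": 0.91},
    {"text": "Lee", "type": "Name", "confidence": 0.62},
    {"text": "Acme", "type": "Company", "confidence": 0.40},
]
print([e["text"] for e in filter_entities(sample, threshold=0.5, top_n=1)])
# prints ['Ann Lee']
```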

Response

entities
array
List of extracted scalar entities, sorted by confidence descending within each type. For nested or array entity types, values appear in structured_result instead.
structured_result
object | null
Present when any entity in the request uses a nested object or array schema. Contains the full structured extraction result keyed by entity name. Scalar fields from the same request also appear here alongside their nested counterparts. null for flat extraction requests.
ocr_text
string | null
The raw text extracted from the document via OCR. null if no document was provided.

Error responses

Status | Meaning
422 Unprocessable Entity | Malformed request body, neither text nor document provided, or OCR failed.
500 Internal Server Error | Inference service unavailable.
504 Gateway Timeout | Extraction request timed out.
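One way a client might act on these status codes is sketched below in Python. The retry policy is an assumption, not something this page specifies: it treats 500 and 504 as transient and 422 as a request that must be fixed before resending. The names classify_status and backoff_delays are hypothetical.

```python
# Status codes come from the table above; treating them as retryable is an assumption.
RETRYABLE = {500, 504}


def classify_status(status):
    """Return 'ok', 'retry', or 'fail' for an /extract HTTP status code."""
    if 200 <= status < 300:
        return "ok"
    if status in RETRYABLE:
        return "retry"  # service unavailable or timed out: try again later
    return "fail"       # e.g. 422: fix the request body before resending


def backoff_delays(attempts, base=1.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** i for i in range(attempts)]


print(classify_status(422), classify_status(504), backoff_delays(3))
# prints fail retry [1.0, 2.0, 4.0]
```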

Authentication

Include your API key in every request using the x-api-key header.
-H "x-api-key: <your-api-key>"
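For callers not using curl, the same header can be set from Python's standard library. This is a minimal sketch, assuming only the URL, header name, and body fields documented on this page. The helper build_extract_request is hypothetical, and it only builds the request without sending it.

```python
import json
import urllib.request

API_URL = "https://api.scaledown.xyz/extract"


def build_extract_request(api_key, text, entities, threshold=None, top_n=None):
    """Build (but do not send) a POST request for the /extract endpoint."""
    payload = {"text": text, "entities": entities}
    # Only include the optional global settings when the caller sets them.
    if threshold is not None:
        payload["threshold"] = threshold
    if top_n is not None:
        payload["top_n"] = top_n
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_extract_request(
    "<your-api-key>",
    "Henry Wang is a CS student from the SF Bay Area.",
    {"Name": "Full name of the person"},
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```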

Examples

Basic extraction

curl -X POST https://api.scaledown.xyz/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "text": "Henry Wang is a CS student from the SF Bay Area. You can find him on Twitter at @henryw and Instagram at @b0i.",
    "entities": {
      "Name": "Full name of the person",
      "Twitter": "Twitter or X handle",
      "Instagram": "Instagram username"
    }
  }'
Response:
{
  "entities": [
    {
      "text": "Henry Wang",
      "type": "Name",
      "confidence": 0.994,
      "start": 0,
      "end": 10,
      "context": "Henry Wang is a CS student from the SF Bay Area. You can find him on Twitter at @henryw and Instagram at @b0i."
    },
    {
      "text": "@henryw",
      "type": "Twitter",
      "confidence": 0.976,
      "start": 80,
      "end": 87,
      "context": "Henry Wang is a CS student from the SF Bay Area. You can find him on Twitter at @henryw and Instagram at @b0i."
    },
    {
      "text": "@b0i",
      "type": "Instagram",
      "confidence": 0.978,
      "start": 105,
      "end": 109,
      "context": "Henry Wang is a CS student from the SF Bay Area. You can find him on Twitter at @henryw and Instagram at @b0i."
    }
  ]
}
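Because entities is a flat list, a common first step is to group results by type. The Python sketch below does that for a trimmed copy of the response above. The helper name group_by_type is hypothetical.

```python
from collections import defaultdict


def group_by_type(response):
    """Map each entity type to its extracted texts, in response order."""
    grouped = defaultdict(list)
    for entity in response.get("entities", []):
        grouped[entity["type"]].append(entity["text"])
    return dict(grouped)


# Trimmed copy of the response above (offsets and context omitted).
response = {
    "entities": [
        {"text": "Henry Wang", "type": "Name", "confidence": 0.994},
        {"text": "@henryw", "type": "Twitter", "confidence": 0.976},
        {"text": "@b0i", "type": "Instagram", "confidence": 0.978},
    ]
}
print(group_by_type(response))
# prints {'Name': ['Henry Wang'], 'Twitter': ['@henryw'], 'Instagram': ['@b0i']}
```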

Extracting from a document

Pass a base64-encoded image or PDF in the document field. OCR is performed automatically and the extracted text is used as the input. The raw OCR output is returned as ocr_text.
# macOS syntax; with GNU coreutils (most Linux systems) use: base64 -w 0 contract.pdf
DOCUMENT=$(base64 -b 0 -i contract.pdf)

curl -X POST https://api.scaledown.xyz/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "document": "'"$DOCUMENT"'",
    "document_mime_type": "application/pdf",
    "entities": {
      "party_name": "Name of a party to the contract",
      "effective_date": "The date the contract takes effect",
      "governing_law": "The jurisdiction or governing law clause"
    }
  }'
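The same request body can be assembled in Python, which avoids the platform differences between base64 command-line tools. This is a sketch under the field names documented above. The helper build_document_payload is hypothetical, and the file bytes are a placeholder rather than a real PDF.

```python
import base64
import json


def build_document_payload(file_bytes, mime_type, entities):
    """Return the JSON body for an /extract request that sends a document."""
    return {
        # The document field carries the file as a base64 string, not raw bytes.
        "document": base64.b64encode(file_bytes).decode("ascii"),
        "document_mime_type": mime_type,
        "entities": entities,
    }


# In practice: file_bytes = open("contract.pdf", "rb").read()
file_bytes = b"%PDF-1.4 placeholder bytes"
payload = build_document_payload(
    file_bytes,
    "application/pdf",
    {"effective_date": "The date the contract takes effect"},
)
body = json.dumps(payload)  # ready to POST with the x-api-key header
```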

Structured (nested) extraction

For more complex documents, you can define nested schemas to extract structured objects or arrays of objects. Use a nested object to extract a single structured group of fields, or an array of objects to extract a repeated structure such as invoice line items. The full structured output is returned in the structured_result field. Scalar fields in the same request are also included there, alongside any nested values.
Nested object example (extract a single structured address):
curl -X POST https://api.scaledown.xyz/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "text": "Ship to: Jane Smith, 42 Maple Street, Springfield, IL 62701.",
    "entities": {
      "recipient": "The full name of the recipient",
      "address": {
        "street": "Street address including number",
        "city": "City name",
        "state": "Two-letter state code",
        "zip": "ZIP or postal code"
      }
    }
  }'
Response:
{
  "entities": [
    {
      "text": "Jane Smith",
      "type": "recipient",
      "confidence": 1.0,
      "start": 9,
      "end": 19,
      "context": "Ship to: Jane Smith, 42 Maple Street, Springfield, IL 62701."
    }
  ],
  "structured_result": {
    "recipient": "Jane Smith",
    "address": {
      "street": "42 Maple Street",
      "city": "Springfield",
      "state": "IL",
      "zip": "62701"
    }
  },
  "ocr_text": null
}

Array schema example — extract invoice line items:
curl -X POST https://api.scaledown.xyz/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "text": "Invoice: 1x Widget A @ $10.00, 3x Widget B @ $5.00, 1x Shipping @ $8.50",
    "entities": {
      "vendor": "The name of the vendor or supplier",
      "line_items": [
        {
          "description": "Description or name of the line item",
          "quantity": "Quantity ordered",
          "unit_price": "Price per unit"
        }
      ]
    }
  }'
Response:
{
  "entities": [],
  "structured_result": {
    "vendor": null,
    "line_items": [
      { "description": "Widget A", "quantity": "1", "unit_price": "$10.00" },
      { "description": "Widget B", "quantity": "3", "unit_price": "$5.00" },
      { "description": "Shipping", "quantity": "1", "unit_price": "$8.50" }
    ]
  },
  "ocr_text": null
}
When using array schemas, array fields appear only in structured_result. The entities array contains only scalar fields from the same request that could be matched to a span in the text.
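Values in structured_result arrive as strings, so numeric work needs a parsing step. The Python sketch below totals the line items from the response above. The helper invoice_total is hypothetical, and it assumes prices carry at most a leading "$" and no thousands separators.

```python
from decimal import Decimal


def invoice_total(structured_result):
    """Sum quantity * unit_price over the extracted line items."""
    total = Decimal("0")
    for item in structured_result.get("line_items") or []:
        # Values arrive as strings such as "3" and "$5.00".
        quantity = Decimal(item["quantity"])
        unit_price = Decimal(item["unit_price"].lstrip("$"))
        total += quantity * unit_price
    return total


structured_result = {
    "vendor": None,
    "line_items": [
        {"description": "Widget A", "quantity": "1", "unit_price": "$10.00"},
        {"description": "Widget B", "quantity": "3", "unit_price": "$5.00"},
        {"description": "Shipping", "quantity": "1", "unit_price": "$8.50"},
    ],
}
print(invoice_total(structured_result))
# prints 33.50
```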

With per-entity overrides

Use per-entity threshold and top_n when different entity types need different precision, or when you only want the single best match for a given type.
curl -X POST https://api.scaledown.xyz/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "text": "...",
    "entities": {
      "Name": {
        "description": "Full name of a person",
        "threshold": 0.3,
        "top_n": 1
      },
      "Company": {
        "description": "Company or organization name",
        "threshold": 0.7
      },
      "Email": "Email address"
    },
    "threshold": 0.5,
    "top_n": 5
  }'
In this example:
  • Name uses threshold 0.3 and returns at most 1 result
  • Company uses threshold 0.7 and returns up to 5 results (global top_n)
  • Email uses the global threshold 0.5 and returns up to 5 results
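The override rules listed above can be expressed as a small resolution step. This Python sketch is illustrative only: effective_settings is a hypothetical helper, and it covers flat definitions (plain strings and objects with description, threshold, and top_n), not nested or array schemas.

```python
def effective_settings(entities, threshold=0.5, top_n=0):
    """Resolve the threshold and top_n that apply to each entity type."""
    resolved = {}
    for name, definition in entities.items():
        # A plain-string definition carries no overrides.
        overrides = definition if isinstance(definition, dict) else {}
        resolved[name] = (
            overrides.get("threshold", threshold),
            overrides.get("top_n", top_n),
        )
    return resolved


entities = {
    "Name": {"description": "Full name of a person", "threshold": 0.3, "top_n": 1},
    "Company": {"description": "Company or organization name", "threshold": 0.7},
    "Email": "Email address",
}
print(effective_settings(entities, threshold=0.5, top_n=5))
# prints {'Name': (0.3, 1), 'Company': (0.7, 5), 'Email': (0.5, 5)}
```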

Writing good entity labels

The entity name and description are both used as part of the model’s search criteria, so wording them well is the biggest lever you have on extraction quality.
  • Use lowercase or Title Case. The model was trained with lowercase labels. Keeping your entity names lowercase (e.g. person, company) or Title Case (e.g. Person, Company) produces better results than ALL_CAPS or other conventions.
  • Be specific with names, and test synonyms. The entity name itself influences what the model looks for. person and full name will find slightly different things. If results are missing or noisy, try rephrasing the name: person name, individual, or full name may all behave differently on your data.
  • Labels can be descriptive phrases, not just single words. Instead of city, use capital city and population center. The extra context helps the model distinguish between entity types that might otherwise overlap.
  • Descriptions can be full instructions. Rather than "Name of the person", write "Find the first and last name of the person mentioned in the text". Instruction-style descriptions consistently outperform short noun phrases on complex or ambiguous entities.
  • Avoid mixing overlapping granularities in the same call. If you include both location and city, the model has to decide which label to assign to a city, and will often split results unpredictably between them. Pick one level of granularity per concept.
Examples:
Instead of | Use
CITY | city or City
city + location in the same call | just city or just location
"Name" | "Find the first and last name of the person in the text"
"city" (when you want capitals specifically) | "capital city and population center"
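The casing advice above is easy to check mechanically before sending a request. The Python sketch below flags ALL_CAPS entity names and suggests the lowercase and Title Case forms. The helper label_warnings is hypothetical, and it does not attempt to detect overlapping granularities, which needs human judgment.

```python
def label_warnings(names):
    """Return warnings for entity names written in ALL_CAPS."""
    warnings = []
    for name in names:
        letters = [c for c in name if c.isalpha()]
        # ALL_CAPS names are discouraged; lowercase or Title Case is preferred.
        if len(letters) > 1 and all(c.isupper() for c in letters):
            warnings.append(f"{name}: use '{name.lower()}' or '{name.title()}'")
    return warnings


print(label_warnings(["CITY", "person", "Full Name"]))
# prints ["CITY: use 'city' or 'City'"]
```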

Notes

  • Results within each entity type are ranked by confidence descending before top_n is applied.
  • The context field is always derived from the original text input — it is not generated by the model.
  • Character offsets (start, end) refer to byte positions in the original text string.
  • There is no fixed limit on the number of entity types you can define in a single request.
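The offset and context notes can be checked against a response with plain string slicing. The Python sketch below uses the nested-extraction example from this page. It assumes that end is exclusive (as "Jane Smith" at 9 to 19 suggests) and that the input is ASCII, so byte and character positions coincide. Both helper names are hypothetical, and context_window only approximates how the context field might be derived.

```python
def check_offsets(text, entity):
    """True if slicing the input at [start:end] reproduces the entity text."""
    return text[entity["start"]:entity["end"]] == entity["text"]


def context_window(text, start, end, width=500):
    """Up to `width` characters on each side of the span, clipped to the text."""
    return text[max(0, start - width):min(len(text), end + width)]


text = "Ship to: Jane Smith, 42 Maple Street, Springfield, IL 62701."
entity = {"text": "Jane Smith", "type": "recipient", "start": 9, "end": 19}

print(check_offsets(text, entity))
# prints True
print(context_window(text, entity["start"], entity["end"]) == text)
# prints True (the whole input is shorter than the 500-character window)
```

For non-ASCII input, byte offsets and Python string indices can differ, so slice text.encode("utf-8") instead if the offsets turn out to be byte-based.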

Authorizations

x-api-key
string
header
required

Body

application/json
text
string
required unless document is provided

The input text to extract entities from.

entities
object
required

A mapping of entity type names to their definition.

threshold
number
default:0.5

Global confidence threshold (0–1).

top_n
number
default:0

Global limit on results per entity type. 0 returns all above threshold.

Response

Successful extraction

entities
object[]