Extract named entities from text using a custom-defined schema.
The /extract endpoint runs Named Entity Recognition (NER) over a block of text. Unlike standard NER, you define the entity types you want in plain English: the model uses your descriptions to find matching spans and returns each one with a confidence score and surrounding context.
Every result includes up to 500 characters of surrounding text on each side, so you can validate or use the extracted value without going back to the source.
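A request to the endpoint might be assembled as in the sketch below. The base URL is a placeholder, and the body field names (text, entities, threshold, top_n) and the name-to-description shape of the entities map are assumptions inferred from this page, not a confirmed schema. The helper only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

# Placeholder host: the real base URL is not given in this section.
BASE_URL = "https://api.example.com"


def build_extract_request(text, entities, api_key, threshold=None, top_n=None):
    """Build (but do not send) a POST request for the /extract endpoint."""
    if not entities:
        # An empty entities map is rejected by the API with 400 Bad Request.
        raise ValueError("entities must contain at least one entity type")
    body = {"text": text, "entities": entities}
    # Global settings are only included when the caller supplies them.
    if threshold is not None:
        body["threshold"] = threshold
    if top_n is not None:
        body["top_n"] = top_n
    return urllib.request.Request(
        BASE_URL + "/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_extract_request(
    text="Ada Lovelace joined Analytical Engines Ltd in London.",
    entities={
        "person": "Find the first and last name of the person mentioned in the text",
    },
    api_key="YOUR_API_KEY",
)
# To send it: urllib.request.urlopen(request)
```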
Each entity type can be given as an object with description, threshold, and top_n fields that override the global values for that entity type only.

top_n: 0 returns all results above the threshold. Can be overridden per entity type.

| Status | Meaning |
|---|---|
| 400 Bad Request | Malformed request body, missing required fields, or empty entities map. |
| 401 Unauthorized | Missing or invalid x-api-key. |
| 429 Too Many Requests | Rate limit exceeded. Back off and retry. |
| 500 Internal Server Error | Inference service unavailable. |
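The "back off and retry" advice for 429 can be sketched as a small wrapper. The retry count and the doubling delay are assumptions chosen for illustration; this page does not specify a schedule. The send argument is a hypothetical zero-argument callable that performs the HTTP call and returns a (status, body) pair, so the wrapper stays independent of any HTTP library.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429.

    Any other status (200, 400, 401, 500, ...) is returned to the caller
    immediately. If every attempt is rate limited, the last 429 is returned.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return status, body
```

Only 429 is retried here. A 400 or 401 will not succeed until the request itself is fixed, and whether a 500 is worth retrying is left to the caller.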
Pass your API key in the x-api-key header.
Override threshold and top_n per entity type when different entity types need different precision, or when you only want the single best match for a given type.
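A request body with per-entity overrides might look like the sketch below. The field layout (a plain description string for the simple form, an object for the override form) and the sample text are assumptions for illustration. The effective_settings helper is hypothetical client-side code that shows how the global values and the overrides would combine.

```python
# Global threshold and top_n apply to every entity type unless overridden.
payload = {
    "text": "Contact Jane Doe at Acme Corp: jane.doe@acme.example",
    "threshold": 0.5,
    "top_n": 5,
    "entities": {
        # Overrides both threshold and top_n.
        "Name": {
            "description": "Find the first and last name of the person mentioned in the text",
            "threshold": 0.3,
            "top_n": 1,
        },
        # Overrides the threshold only; top_n falls back to the global value.
        "Company": {
            "description": "Name of the company or organization",
            "threshold": 0.7,
        },
        # Plain description; uses both global values.
        "Email": "Email address",
    },
}


def effective_settings(payload):
    """Resolve the threshold and top_n that apply to each entity type."""
    resolved = {}
    for name, spec in payload["entities"].items():
        overrides = spec if isinstance(spec, dict) else {}
        resolved[name] = {
            "threshold": overrides.get("threshold", payload["threshold"]),
            "top_n": overrides.get("top_n", payload["top_n"]),
        }
    return resolved


settings = effective_settings(payload)
```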
- Name uses threshold 0.3 and returns at most 1 result
- Company uses threshold 0.7 and returns up to 5 results (global top_n)
- Email uses the global threshold 0.5 and returns up to 5 results

Lowercase (e.g. person, company) or Title Case (e.g. Person, Company) produces better results than ALL_CAPS or other conventions.
Be specific with names, and test synonyms. The entity name itself influences what the model looks for. person and full name will find slightly different things. If results are missing or noisy, try rephrasing the name — person name, individual, or full name may all behave differently on your data.
Labels can be descriptive phrases, not just single words. Instead of city, use capital city and population center. The extra context helps the model distinguish between entity types that might otherwise overlap.
Descriptions can be full instructions. Rather than "Name of the person", write "Find the first and last name of the person mentioned in the text". Instruction-style descriptions consistently outperform short noun phrases on complex or ambiguous entities.
Avoid mixing overlapping granularities in the same call. If you include both location and city, the model has to decide which label to assign to a city — and will often split results unpredictably between them. Pick one level of granularity per concept.
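Two of the tips above (avoid ALL_CAPS names, avoid overlapping granularities) can be checked mechanically before a request is sent. The helper below is a hypothetical client-side sketch, not part of the API, and its list of overlapping pairs contains only the location/city example from this page; extend it for your own schema.

```python
def check_entity_names(names, overlapping=(("location", "city"),)):
    """Return warnings for entity names that go against the naming tips."""
    warnings = []
    for name in names:
        # str.isupper() is True when every cased character is uppercase.
        if name.isupper():
            warnings.append(f"{name!r}: prefer lowercase or Title Case")
    lowered = {name.lower() for name in names}
    for broad, narrow in overlapping:
        if broad in lowered and narrow in lowered:
            warnings.append(f"{broad!r} and {narrow!r} overlap: keep only one")
    return warnings


problems = check_entity_names(["CITY", "location", "city"])
```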
Examples:
| Instead of | Use |
|---|---|
| CITY | city or City |
| city + location in the same call | just city or just location |
| "Name" | "Find the first and last name of the person in the text" |
| "city" (when you want capitals specifically) | "capital city and population center" |
- top_n is applied.
- The context field is always derived from the original text input; it is not generated by the model.
- Offsets (start, end) refer to byte positions in the original text string.

Successful extraction
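Because start and end are byte positions, using them directly as Python str indices can return the wrong span whenever the text contains non-ASCII characters. The sketch below assumes the service counts bytes in UTF-8 and that end is exclusive; neither detail is stated on this page, so verify both against a real response.

```python
def span_text(text, start, end, encoding="utf-8"):
    """Return the substring covered by byte offsets start (inclusive) to end (exclusive)."""
    return text.encode(encoding)[start:end].decode(encoding)


text = "Café owner Zoë Müller met us."
# Byte offsets for "Zoë Müller" under the UTF-8 assumption.
start, end = 12, 24

by_bytes = span_text(text, start, end)  # "Zoë Müller"
by_chars = text[start:end]              # "oë Müller me" (shifted by the 2-byte é)
```

On pure ASCII input the two approaches agree, which is why this kind of mismatch often goes unnoticed until accented or non-Latin text appears.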