SIE

Overview

The extract primitive pulls structured data from unstructured content. It handles named entity recognition, relation extraction, text classification, and vision tasks like captioning and OCR.

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
text = Item(text="Apple CEO Tim Cook announced the iPhone 16 in Cupertino.")
result = client.extract(
    "urchade/gliner_multi-v2.1",
    text,
    labels=["person", "organization", "product", "location"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']} (score: {entity['score']:.2f})")
# organization: Apple (score: 0.95)
# person: Tim Cook (score: 0.93)
# product: iPhone 16 (score: 0.89)
# location: Cupertino (score: 0.87)
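
Because every entity carries a confidence score, a common follow-up step is to drop low-confidence matches. The sketch below works on plain dictionaries shaped like the output above; `filter_entities` and the 0.9 threshold are illustrative choices, not part of the SDK.

```python
def filter_entities(entities, min_score=0.9):
    """Keep only entities whose confidence is at least min_score."""
    return [e for e in entities if e["score"] >= min_score]


# Sample data mirroring the example output above
entities = [
    {"label": "organization", "text": "Apple", "score": 0.95},
    {"label": "person", "text": "Tim Cook", "score": 0.93},
    {"label": "product", "text": "iPhone 16", "score": 0.89},
    {"label": "location", "text": "Cupertino", "score": 0.87},
]

confident = filter_entities(entities, min_score=0.9)
print([e["text"] for e in confident])
# ['Apple', 'Tim Cook']
```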

GLiNER models extract entities with zero-shot label support, so you define your own entity types at query time.

There is no predefined schema. Pass whatever labels you need:

# Domain-specific entities
result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="The merger between Acme Corp and Beta Inc requires FTC approval."),
    labels=["company", "regulatory_body", "legal_action"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']}")
# company: Acme Corp
# company: Beta Inc
# regulatory_body: FTC
Entities include character positions for highlighting or further processing:

result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="Tim Cook works at Apple."),
    labels=["person", "organization"]
)
for entity in result["entities"]:
    print(f"{entity['label']}: '{entity['text']}' at positions [{entity['start']}:{entity['end']}]")
# person: 'Tim Cook' at positions [0:8]
# organization: 'Apple' at positions [18:23]
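
The offsets can drive inline highlighting. The sketch below assumes `end` is exclusive, which matches the positions shown above (`text[0:8]` is `Tim Cook`). The `highlight` helper and its bracket markup are illustrative only.

```python
def highlight(text, entities):
    """Wrap each entity span as [span](label), using start/end offsets."""
    parts = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["start"]):
        start, end = entity["start"], entity["end"]
        if start < cursor:
            continue  # skip spans that overlap one already emitted
        span = text[start:end]
        parts.append(text[cursor:start])
        parts.append(f"[{span}]({entity['label']})")
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts)


text = "Tim Cook works at Apple."
entities = [
    {"label": "person", "text": "Tim Cook", "start": 0, "end": 8},
    {"label": "organization", "text": "Apple", "start": 18, "end": 23},
]

print(highlight(text, entities))
# [Tim Cook](person) works at [Apple](organization).
```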

Pass a list of items to process multiple documents in a single call. Each result carries the id of its item:

documents = [
    Item(id="doc-1", text="Microsoft acquired Activision for $69 billion."),
    Item(id="doc-2", text="Sundar Pichai leads Google's AI initiatives."),
]
results = client.extract(
    "urchade/gliner_multi-v2.1",
    documents,
    labels=["person", "organization", "money"]
)
for result in results:
    print(f"\n{result['id']}:")
    for entity in result["entities"]:
        print(f"  {entity['label']}: {entity['text']}")
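
Batch results can be indexed by document id for later lookup. This is a sketch under assumptions: the `results` list below is hand-written illustrative data in the shape used above, not real model output, and `texts_by_document` is a hypothetical helper.

```python
def texts_by_document(results, label):
    """Map each document id to the entity texts that carry the given label."""
    return {
        result["id"]: [e["text"] for e in result["entities"] if e["label"] == label]
        for result in results
    }


# Illustrative data only (not actual model output)
results = [
    {"id": "doc-1", "entities": [
        {"label": "organization", "text": "Microsoft"},
        {"label": "organization", "text": "Activision"},
        {"label": "money", "text": "$69 billion"},
    ]},
    {"id": "doc-2", "entities": [
        {"label": "person", "text": "Sundar Pichai"},
        {"label": "organization", "text": "Google"},
    ]},
]

print(texts_by_document(results, "organization"))
# {'doc-1': ['Microsoft', 'Activision'], 'doc-2': ['Google']}
```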

GLiREL models extract relationships between entities:

result = client.extract(
    "jackboyla/glirel-large-v0",
    Item(text="Tim Cook is the CEO of Apple Inc."),
    labels=["person", "organization"],
    # Relation types to extract
    output_schema={"relation_types": ["works_for", "ceo_of", "founded"]}
)
for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
# Tim Cook --ceo_of--> Apple Inc.
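
Relations map naturally onto (head, relation, tail) triples, the usual input for a knowledge graph. A minimal sketch, assuming relation dictionaries with `head`, `relation`, `tail`, and `score` keys; the scores and the second relation below are invented for illustration, and `to_triples` is not an SDK function.

```python
def to_triples(relations, min_score=0.5):
    """Convert relation dicts to (head, relation, tail) tuples above a score cutoff."""
    return [
        (r["head"], r["relation"], r["tail"])
        for r in relations
        if r["score"] >= min_score
    ]


# Illustrative data: scores are made up for the example
relations = [
    {"head": "Tim Cook", "relation": "ceo_of", "tail": "Apple Inc.", "score": 0.91},
    {"head": "Tim Cook", "relation": "founded", "tail": "Apple Inc.", "score": 0.12},
]

print(to_triples(relations))
# [('Tim Cook', 'ceo_of', 'Apple Inc.')]
```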

GLiClass models classify text into categories:

result = client.extract(
    "knowledgator/gliclass-base-v1.0",
    Item(text="I absolutely loved this movie! The acting was superb."),
    labels=["positive", "negative", "neutral"]
)
for classification in result["classifications"]:
    print(f"{classification['label']}: {classification['score']:.2f}")
# positive: 0.94
# neutral: 0.04
# negative: 0.02
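
When only the winning category matters, pick the highest-scoring classification rather than relying on list order. A small sketch on data shaped like the output above; `top_label` is a hypothetical helper.

```python
def top_label(classifications):
    """Return (label, score) of the highest-scoring classification."""
    best = max(classifications, key=lambda c: c["score"])
    return best["label"], best["score"]


# Sample data mirroring the example output above
classifications = [
    {"label": "positive", "score": 0.94},
    {"label": "neutral", "score": 0.04},
    {"label": "negative", "score": 0.02},
]

print(top_label(classifications))
# ('positive', 0.94)
```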

Florence-2 and Donut models extract structured data from images.

# image_bytes, document_image, and receipt_image are assumed to hold
# raw image bytes loaded elsewhere (for example, read from disk).

# Image captioning
result = client.extract(
    "microsoft/Florence-2-base",
    Item(images=[{"data": image_bytes, "format": "jpeg"}]),
    instruction="<CAPTION>"
)
print(result["data"]["caption"])
# "A golden retriever playing fetch in a park on a sunny day."

# OCR
result = client.extract(
    "microsoft/Florence-2-base",
    Item(images=[{"data": document_image, "format": "png"}]),
    instruction="<OCR>"
)
print(result["data"]["text"])
# Extracted text from the document image

# Document question answering
result = client.extract(
    "naver-clova-ix/donut-base-finetuned-docvqa",
    Item(images=[{"data": receipt_image, "format": "jpeg"}]),
    instruction="What is the total amount?"
)
print(result["data"]["answer"])
# "$42.50"
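
The image dictionaries above need raw bytes plus a format string. One way to build them from a file is sketched here. `image_payload` is a hypothetical helper, and the suffix mapping covers only the two formats that appear on this page (`jpeg` and `png`); whether the server accepts other formats is not stated here.

```python
from pathlib import Path

# Only the formats shown in the examples above; extending this is an assumption.
SUFFIX_TO_FORMAT = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png"}


def image_payload(path):
    """Read an image file and return a {"data": ..., "format": ...} dict."""
    path = Path(path)
    fmt = SUFFIX_TO_FORMAT.get(path.suffix.lower())
    if fmt is None:
        raise ValueError(f"unsupported image suffix: {path.suffix!r}")
    return {"data": path.read_bytes(), "format": fmt}


# Usage (requires an existing file and the SDK):
# item = Item(images=[image_payload("receipt.jpg")])
```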

The ExtractResult contains different fields based on the extraction type:

| Field | Type | When Present |
| --- | --- | --- |
| id | str \| None | Always (if provided) |
| entities | list[Entity] | NER models (GLiNER) |
| relations | list[Relation] | Relation extraction (GLiREL) |
| classifications | list[Classification] | Classification models (GLiClass) |
| objects | list[DetectedObject] | Object detection (GroundingDINO, OWLv2) |
| data | dict | Vision models (captions, OCR, answers) |

Entity fields:

| Field | Type | Description |
| --- | --- | --- |
| text | str | Extracted text span |
| label | str | Entity type |
| score | float | Confidence score (0-1) |
| start | int | Start character position |
| end | int | End character position |

Relation fields:

| Field | Type | Description |
| --- | --- | --- |
| head | str | Source entity |
| tail | str | Target entity |
| relation | str | Relation type |
| score | float | Confidence score |
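
Since the populated fields depend on the model, generic code can check which ones a result actually carries before reading them. This is a sketch that assumes a result behaves like a dictionary and that unused fields are either missing or empty; `present_fields` is a hypothetical helper.

```python
RESULT_FIELDS = ("entities", "relations", "classifications", "objects", "data")


def present_fields(result):
    """List the result fields that exist and are non-empty."""
    return [field for field in RESULT_FIELDS if result.get(field)]


# Illustrative result from an NER-style call
result = {
    "id": "doc-1",
    "entities": [{"text": "Tim Cook", "label": "person", "score": 0.93, "start": 0, "end": 8}],
    "relations": [],
}

print(present_fields(result))
# ['entities']
```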

NER models:

| Model | Languages | Notes |
| --- | --- | --- |
| urchade/gliner_multi-v2.1 | Multilingual | General-purpose NER |
| urchade/gliner_large-v2.1 | English | Larger model (459M params) |
| numind/NuNER_Zero | English | Zero-shot NER |
| urchade/gliner_multi_pii-v1 | Multilingual | PII detection |

Classification models:

| Model | Notes |
| --- | --- |
| knowledgator/gliclass-base-v1.0 | Zero-shot classification |
| knowledgator/gliclass-small-v1.0 | Faster, smaller |

Vision models:

| Model | Tasks |
| --- | --- |
| microsoft/Florence-2-base | Caption, OCR, detection |
| microsoft/Florence-2-large | Higher quality Florence-2 |
| naver-clova-ix/donut-base-finetuned-docvqa | Document question answering |
| naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt parsing |

See Extraction Models for the complete catalog.

The server responds with msgpack by default. To receive JSON, set the Accept header to application/json:

curl -X POST http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "items": [{"text": "Tim Cook is the CEO of Apple."}],
    "params": {"labels": ["person", "organization"]}
  }'

Response:

{
  "model": "urchade/gliner_multi-v2.1",
  "items": [
    {
      "entities": [
        {"text": "Tim Cook", "label": "person", "score": 0.93, "start": 0, "end": 8},
        {"text": "Apple", "label": "organization", "score": 0.95, "start": 23, "end": 28}
      ]
    }
  ]
}
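
The same request can be assembled without the SDK. The sketch below uses only the Python standard library and mirrors the URL, headers, and body of the curl call above; `build_extract_request` is a hypothetical helper, and the request is built but not sent, so it runs without a server.

```python
import json
import urllib.request


def build_extract_request(base_url, model, texts, labels):
    """Build a POST request for /v1/extract/<model> that asks for a JSON response."""
    body = {
        "items": [{"text": t} for t in texts],
        "params": {"labels": labels},
    }
    return urllib.request.Request(
        f"{base_url}/v1/extract/{model}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",  # without this the server replies in msgpack
        },
        method="POST",
    )


req = build_extract_request(
    "http://localhost:8080",
    "urchade/gliner_multi-v2.1",
    ["Tim Cook is the CEO of Apple."],
    ["person", "organization"],
)
print(req.get_method(), req.full_url)
# POST http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1

# Sending it requires a running server:
# with urllib.request.urlopen(req) as resp:
#     payload = json.load(resp)
```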