Overview

The extract primitive pulls structured data from unstructured content. It handles named entity recognition, relation extraction, text classification, and vision tasks like captioning and OCR.

Quick Example

Python
TypeScript

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

text = Item(text="Apple CEO Tim Cook announced the iPhone 16 in Cupertino.")

result = client.extract(
    "urchade/gliner_multi-v2.1",
    text,
    labels=["person", "organization", "product", "location"]
)

for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']} (score: {entity['score']:.2f})")
# organization: Apple (score: 0.95)
# person: Tim Cook (score: 0.93)
# product: iPhone 16 (score: 0.89)
# location: Cupertino (score: 0.87)

import { SIEClient } from "@sie/sdk";

const client = new SIEClient("http://localhost:8080");

const text = { text: "Apple CEO Tim Cook announced the iPhone 16 in Cupertino." };

const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  text,
  { labels: ["person", "organization", "product", "location"] }
);

for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text} (score: ${entity.score.toFixed(2)})`);
}
// organization: Apple (score: 0.95)
// person: Tim Cook (score: 0.93)
// product: iPhone 16 (score: 0.89)
// location: Cupertino (score: 0.87)

await client.close();
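
Because every entity carries a score, a common follow-up step is to drop low-confidence matches. The sketch below works on plain dicts copied from the example output above; the helper name `filter_by_score` and the 0.90 threshold are illustrative choices, not part of the SDK.

```python
# Entities copied from the example output above; in practice they would
# come from result["entities"].
entities = [
    {"label": "organization", "text": "Apple", "score": 0.95},
    {"label": "person", "text": "Tim Cook", "score": 0.93},
    {"label": "product", "text": "iPhone 16", "score": 0.89},
    {"label": "location", "text": "Cupertino", "score": 0.87},
]


def filter_by_score(entities, min_score):
    """Keep entities at or above min_score, highest score first."""
    kept = [e for e in entities if e["score"] >= min_score]
    return sorted(kept, key=lambda e: e["score"], reverse=True)


confident = filter_by_score(entities, min_score=0.90)
print([e["text"] for e in confident])
# ['Apple', 'Tim Cook']
```

A suitable threshold depends on the model and the labels, so it is worth tuning on a sample of your own data.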

Named Entity Recognition (NER)

GLiNER models extract entities with zero-shot label support. Define your own entity types at query time.

Custom Entity Types

There is no predefined schema. Pass whatever labels you need with each request:

Python
TypeScript

# Domain-specific entities
result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="The merger between Acme Corp and Beta Inc requires FTC approval."),
    labels=["company", "regulatory_body", "legal_action"]
)

for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']}")
# company: Acme Corp
# company: Beta Inc
# regulatory_body: FTC

// Domain-specific entities
const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "The merger between Acme Corp and Beta Inc requires FTC approval." },
  { labels: ["company", "regulatory_body", "legal_action"] }
);

for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text}`);
}
// company: Acme Corp
// company: Beta Inc
// regulatory_body: FTC
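
In the output above, `legal_action` was requested but no span was tagged with it. A small check like the following can flag labels that produced no matches, which may help when iterating on label names. It is a sketch over plain dicts, and `unmatched_labels` is a hypothetical helper, not an SDK function.

```python
def unmatched_labels(requested, entities):
    """Return requested labels that no entity was tagged with, in request order."""
    found = {e["label"] for e in entities}
    return [label for label in requested if label not in found]


requested = ["company", "regulatory_body", "legal_action"]
# Entities copied from the example output above.
entities = [
    {"label": "company", "text": "Acme Corp"},
    {"label": "company", "text": "Beta Inc"},
    {"label": "regulatory_body", "text": "FTC"},
]

print(unmatched_labels(requested, entities))
# ['legal_action']
```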

Entity Positions

Entities include character positions for highlighting or further processing:

Python
TypeScript

result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="Tim Cook works at Apple."),
    labels=["person", "organization"]
)

for entity in result["entities"]:
    print(f"{entity['label']}: '{entity['text']}' at positions [{entity['start']}:{entity['end']}]")
# person: 'Tim Cook' at positions [0:8]
# organization: 'Apple' at positions [18:23]

const result = await client.extract(
  "urchade/gliner_multi-v2.1",
  { text: "Tim Cook works at Apple." },
  { labels: ["person", "organization"] }
);

for (const entity of result.entities) {
  console.log(`${entity.label}: '${entity.text}' at positions [${entity.start}:${entity.end}]`);
}
// person: 'Tim Cook' at positions [0:8]
// organization: 'Apple' at positions [18:23]
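
The offsets in the output above behave like Python slice bounds (start inclusive, end exclusive), so they can be used to annotate the original text directly. The following is a minimal sketch of a highlighter; the `[span](label)` markup and the `highlight` helper are illustrative assumptions, and overlapping spans are simply skipped.

```python
def highlight(text, entities):
    """Wrap each entity span as [span](label), using end-exclusive offsets."""
    parts = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["start"]):
        if entity["start"] < cursor:
            continue  # skip spans that overlap one already emitted
        span = text[entity["start"]:entity["end"]]
        parts.append(text[cursor:entity["start"]])
        parts.append(f"[{span}]({entity['label']})")
        cursor = entity["end"]
    parts.append(text[cursor:])  # trailing text after the last entity
    return "".join(parts)


text = "Tim Cook works at Apple."
# Entities copied from the example output above.
entities = [
    {"label": "person", "text": "Tim Cook", "start": 0, "end": 8},
    {"label": "organization", "text": "Apple", "start": 18, "end": 23},
]

print(highlight(text, entities))
# [Tim Cook](person) works at [Apple](organization).
```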

Batch Extraction

Pass a list of items to process multiple documents in one call. Each result carries the `id` of its input item:

Python
TypeScript

documents = [
    Item(id="doc-1", text="Microsoft acquired Activision for $69 billion."),
    Item(id="doc-2", text="Sundar Pichai leads Google's AI initiatives."),
]

results = client.extract(
    "urchade/gliner_multi-v2.1",
    documents,
    labels=["person", "organization", "money"]
)

for result in results:
    print(f"\n{result['id']}:")
    for entity in result["entities"]:
        print(f"  {entity['label']}: {entity['text']}")

const documents = [
  { id: "doc-1", text: "Microsoft acquired Activision for $69 billion." },
  { id: "doc-2", text: "Sundar Pichai leads Google's AI initiatives." },
];

const results = await client.extract(
  "urchade/gliner_multi-v2.1",
  documents,
  { labels: ["person", "organization", "money"] }
);

for (const result of results) {
  console.log(`\n${result.id}:`);
  for (const entity of result.entities) {
    console.log(`  ${entity.label}: ${entity.text}`);
  }
}
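
Once a batch comes back, it is often useful to pivot the per-document results into a per-label index. The sketch below assumes results shaped like the loop above expects (`id` plus a list of `entities`); the sample values are made up for illustration and are not real model output, and `index_by_label` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative results shaped like the batch response; values are made up.
results = [
    {"id": "doc-1", "entities": [
        {"label": "organization", "text": "Microsoft"},
        {"label": "organization", "text": "Activision"},
        {"label": "money", "text": "$69 billion"},
    ]},
    {"id": "doc-2", "entities": [
        {"label": "person", "text": "Sundar Pichai"},
        {"label": "organization", "text": "Google"},
    ]},
]


def index_by_label(results):
    """Map each label to a list of (document id, entity text) pairs."""
    index = defaultdict(list)
    for result in results:
        for entity in result["entities"]:
            index[entity["label"]].append((result["id"], entity["text"]))
    return dict(index)


index = index_by_label(results)
print(index["organization"])
# [('doc-1', 'Microsoft'), ('doc-1', 'Activision'), ('doc-2', 'Google')]
```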

Relation Extraction

GLiREL models extract relationships between entities:

Python
TypeScript

result = client.extract(
    "jackboyla/glirel-large-v0",
    Item(text="Tim Cook is the CEO of Apple Inc."),
    labels=["person", "organization"],
    # Relation types to extract
    output_schema={"relation_types": ["works_for", "ceo_of", "founded"]}
)

for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
# Tim Cook --ceo_of--> Apple Inc.

const result = await client.extract(
  "jackboyla/glirel-large-v0",
  { text: "Tim Cook is the CEO of Apple Inc." },
  {
    labels: ["person", "organization"],
    // Relation types to extract (passed via options)
  }
);

// Note: relations are returned in the result object. This example prints
// only the entities; relation extraction from the TypeScript SDK may
// require additional configuration, depending on model capabilities.
for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text}`);
}
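
Relations returned by the Python example are head/relation/tail triples, which map naturally onto an adjacency list. Here is a minimal sketch of that conversion over plain dicts. The `relations_to_graph` helper and the score value are assumptions for illustration; only the triple itself comes from the example output.

```python
from collections import defaultdict


def relations_to_graph(relations, min_score=0.0):
    """Group (relation, tail) pairs under each head entity."""
    graph = defaultdict(list)
    for rel in relations:
        if rel["score"] >= min_score:
            graph[rel["head"]].append((rel["relation"], rel["tail"]))
    return dict(graph)


relations = [
    # Triple from the example output above; the score is illustrative.
    {"head": "Tim Cook", "tail": "Apple Inc.", "relation": "ceo_of", "score": 0.9},
]

print(relations_to_graph(relations, min_score=0.5))
# {'Tim Cook': [('ceo_of', 'Apple Inc.')]}
```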

Text Classification

GLiClass models classify text into categories:

Python
TypeScript

result = client.extract(
    "knowledgator/gliclass-base-v1.0",
    Item(text="I absolutely loved this movie! The acting was superb."),
    labels=["positive", "negative", "neutral"]
)

for classification in result["classifications"]:
    print(f"{classification['label']}: {classification['score']:.2f}")
# positive: 0.94
# neutral: 0.04
# negative: 0.02

const result = await client.extract(
  "knowledgator/gliclass-base-v1.0",
  { text: "I absolutely loved this movie! The acting was superb." },
  { labels: ["positive", "negative", "neutral"] }
);

// Classification results are returned as entities with label scores
for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.score.toFixed(2)}`);
}
// positive: 0.94
// neutral: 0.04
// negative: 0.02
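
To turn the scored labels into a single decision, one option is to take the highest-scoring label and fall back to `None` when nothing is confident enough. This sketch uses the scores from the example output above; `top_label` and its default threshold of 0.5 are illustrative assumptions.

```python
def top_label(classifications, min_score=0.5):
    """Return the best-scoring label, or None if it falls below min_score."""
    if not classifications:
        return None
    best = max(classifications, key=lambda c: c["score"])
    return best["label"] if best["score"] >= min_score else None


# Scores copied from the example output above.
classifications = [
    {"label": "positive", "score": 0.94},
    {"label": "neutral", "score": 0.04},
    {"label": "negative", "score": 0.02},
]

print(top_label(classifications))
# positive
print(top_label(classifications, min_score=0.99))
# None
```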

Vision Tasks

Florence-2 and Donut models extract structured data from images. The task is selected with an instruction, such as the `<CAPTION>` task prompt for captioning:

result = client.extract(
    "microsoft/Florence-2-base",
    Item(images=[{"data": image_bytes, "format": "jpeg"}]),
    instruction="<CAPTION>"
)

print(result["data"]["caption"])
# "A golden retriever playing fetch in a park on a sunny day."

const result = await client.extract(
  "microsoft/Florence-2-base",
  { images: [imageBytes] },  // Uint8Array of JPEG/PNG data
  { labels: ["<CAPTION>"] }  // Instruction passed via labels
);

// Vision model results vary by task
console.log(result);
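
The Python example passes each image as a dict with `data` (raw bytes) and `format`. A minimal sketch of building that dict from a file on disk follows. The `image_payload` helper and the extension-to-format mapping are assumptions for illustration (the examples on this page only show `"jpeg"` and `"png"`), not part of the SDK.

```python
import tempfile
from pathlib import Path

# Assumed mapping from file extension to the "format" value.
_FORMATS = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png"}


def image_payload(path):
    """Read an image file and return a {"data": bytes, "format": str} dict."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in _FORMATS:
        raise ValueError(f"unsupported image extension: {suffix!r}")
    return {"data": path.read_bytes(), "format": _FORMATS[suffix]}


# Demo with a throwaway file; three bytes stand in for a real image.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "photo.JPG"
    sample.write_bytes(b"\xff\xd8\xff")
    payload = image_payload(sample)

print(payload["format"], len(payload["data"]))
# jpeg 3
```

The returned dict could then be passed as `Item(images=[image_payload("photo.jpg")])`.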

OCR (Text from Images)

Python
TypeScript

result = client.extract(
    "microsoft/Florence-2-base",
    Item(images=[{"data": document_image, "format": "png"}]),
    instruction="<OCR>"
)

print(result["data"]["text"])
# Extracted text from the document image

const result = await client.extract(
  "microsoft/Florence-2-base",
  { images: [documentImage] },  // Uint8Array of PNG data
  { labels: ["<OCR>"] }
);

// OCR results contain extracted text
console.log(result);

Document Understanding

Python
TypeScript

result = client.extract(
    "naver-clova-ix/donut-base-finetuned-docvqa",
    Item(images=[{"data": receipt_image, "format": "jpeg"}]),
    instruction="What is the total amount?"
)

print(result["data"]["answer"])
# "$42.50"

const result = await client.extract(
  "naver-clova-ix/donut-base-finetuned-docvqa",
  { images: [receiptImage] },
  { labels: ["What is the total amount?"] }
);

// Document QA results contain the answer
console.log(result);

Response Format

The ExtractResult contains different fields based on the extraction type:

Field	Type	When Present
`id`	`str | None`	When the input item has an `id`
`entities`	`list[Entity]`	NER models (GLiNER)
`relations`	`list[Relation]`	Relation extraction (GLiREL)
`classifications`	`list[Classification]`	Classification models (GLiClass)
`objects`	`list[DetectedObject]`	Object detection (GroundingDINO, OWLv2)
`data`	`dict`	Vision models (captions, OCR, answers)
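
Code that handles several model types can branch on which of these fields are populated. The sketch below treats a result as a plain dict and assumes that fields which do not apply are either missing or empty; that behavior is an assumption, and `present_fields` is a hypothetical helper.

```python
# Field names taken from the table above.
RESULT_FIELDS = ("entities", "relations", "classifications", "objects", "data")


def present_fields(result):
    """Return the extraction fields that are present and non-empty."""
    return [name for name in RESULT_FIELDS if result.get(name)]


# Illustrative results; the shapes follow the examples on this page.
ner_result = {"id": "doc-1", "entities": [{"text": "Apple", "label": "organization"}]}
vision_result = {"id": None, "data": {"caption": "A golden retriever playing fetch."}}

print(present_fields(ner_result))
# ['entities']
print(present_fields(vision_result))
# ['data']
```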

Entity Fields

Field	Type	Description
`text`	`str`	Extracted text span
`label`	`str`	Entity type
`score`	`float`	Confidence score (0-1)
`start`	`int`	Start character position
`end`	`int`	End character position

Relation Fields

Field	Type	Description
`head`	`str`	Source entity
`tail`	`str`	Target entity
`relation`	`str`	Relation type
`score`	`float`	Confidence score

Recommended Models

NER Models

Model	Languages	Notes
`urchade/gliner_multi-v2.1`	Multilingual	General-purpose NER
`urchade/gliner_large-v2.1`	English	Larger model (459M params)
`numind/NuNER_Zero`	English	Zero-shot NER
`urchade/gliner_multi_pii-v1`	Multilingual	PII detection

Classification Models

Model	Notes
`knowledgator/gliclass-base-v1.0`	Zero-shot classification
`knowledgator/gliclass-small-v1.0`	Faster, smaller

Vision Models

Model	Tasks
`microsoft/Florence-2-base`	Caption, OCR, detection
`microsoft/Florence-2-large`	Higher quality Florence-2
`naver-clova-ix/donut-base-finetuned-docvqa`	Document question answering
`naver-clova-ix/donut-base-finetuned-cord-v2`	Receipt parsing

See Extraction Models for the complete catalog.

HTTP API

The server defaults to msgpack. To send and receive JSON instead, set the `Content-Type` and `Accept` headers to `application/json`:

curl -X POST http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "items": [{"text": "Tim Cook is the CEO of Apple."}],
    "params": {"labels": ["person", "organization"]}
  }'

Response:

{
  "model": "urchade/gliner_multi-v2.1",
  "items": [
    {
      "entities": [
        {"text": "Tim Cook", "label": "person", "score": 0.93, "start": 0, "end": 8},
        {"text": "Apple", "label": "organization", "score": 0.95, "start": 23, "end": 28}
      ]
    }
  ]
}
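
The same request can be made from Python without the SDK. This is a minimal sketch using only the standard library: it mirrors the URL, headers, and body of the curl call above, and the helper names `build_extract_request` and `entities_from_response` are illustrative. The network call itself is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_extract_request(base_url, model, texts, labels):
    """Build a JSON POST request matching the curl example."""
    body = {
        "items": [{"text": t} for t in texts],
        "params": {"labels": labels},
    }
    return urllib.request.Request(
        f"{base_url}/v1/extract/{model}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def entities_from_response(payload):
    """Flatten the per-item entity lists from a parsed JSON response."""
    return [e for item in payload["items"] for e in item.get("entities", [])]


req = build_extract_request(
    "http://localhost:8080",
    "urchade/gliner_multi-v2.1",
    ["Tim Cook is the CEO of Apple."],
    ["person", "organization"],
)
print(req.full_url)
# http://localhost:8080/v1/extract/urchade/gliner_multi-v2.1

# Against a running server:
# with urllib.request.urlopen(req) as resp:
#     payload = json.load(resp)

# Parsed response trimmed from the example above.
payload = {
    "model": "urchade/gliner_multi-v2.1",
    "items": [{"entities": [
        {"text": "Tim Cook", "label": "person", "score": 0.93},
        {"text": "Apple", "label": "organization", "score": 0.95},
    ]}],
}
print([e["text"] for e in entities_from_response(payload)])
# ['Tim Cook', 'Apple']
```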

What’s Next

Extraction models - complete model catalog