
Supported Models

SIE supports 85 pre-configured models across three task families: encode (dense, sparse, multi-vector, and vision embeddings), score (reranking), and extract (NER, relation extraction, classification, and vision). Models are quality-verified against MTEB benchmarks where applicable.

For model selection guidance, see Choosing a Model.


Dense Embeddings

Model | Dims | Max Length | Languages | Bundle
--- | --- | --- | --- | ---
BAAI/bge-m3 | 1024 | 8192 | 100+ | default
Alibaba-NLP/gte-Qwen2-1.5B-instruct | 1536 | 32768 | Multi | default
Alibaba-NLP/gte-Qwen2-7B-instruct | 3584 | 32000 | Multi | sglang
Alibaba-NLP/gte-multilingual-base | 768 | 8192 | 50+ | default
NovaSearch/stella_en_400M_v5 | 1024 | 512 | English | default
NovaSearch/stella_en_1.5B_v5 | 1024 | 512 | English | default
intfloat/e5-small-v2 | 384 | 512 | English | default
intfloat/e5-base-v2 | 768 | 512 | English | default
intfloat/e5-large-v2 | 1024 | 512 | English | default
intfloat/multilingual-e5-large | 1024 | 512 | 100+ | default
intfloat/multilingual-e5-large-instruct | 1024 | 512 | 100+ | default
intfloat/e5-mistral-7b-instruct | 4096 | 4096 | English | sglang
sentence-transformers/all-MiniLM-L6-v2 | 384 | 256 | English | default
nomic-ai/nomic-embed-text-v2-moe | 768 | 2048 | English | default
nvidia/NV-Embed-v2 | 4096 | 32768 | English | default
nvidia/llama-embed-nemotron-8b | 4096 | 8192 | English | sglang
Salesforce/SFR-Embedding-Mistral | 4096 | 4096 | English | sglang
Salesforce/SFR-Embedding-2_R | 4096 | 8192 | English | sglang
GritLM/GritLM-7B | 4096 | 8192 | English | default
Linq-AI-Research/Linq-Embed-Mistral | 4096 | 32768 | English | sglang
google/embeddinggemma-300m | 768 | 2048 | English | default
Qwen/Qwen3-Embedding-0.6B | 1024 | 32768 | Multi | default
Qwen/Qwen3-Embedding-4B | 2560 | 32768 | Multi | sglang
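The Dims column above is the width of each model's output vector; dense retrieval typically ranks documents by cosine similarity between query and document embeddings. A minimal pure-Python sketch of that scoring rule (for illustration only, not part of the SIE SDK):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-d vectors; real embeddings have the dimensionality listed above.
a = [0.3, 0.4, 0.0]
b = [0.6, 0.8, 0.0]
cosine(a, b)  # parallel vectors score close to 1.0
```

Many servers return unit-normalized vectors, in which case cosine similarity reduces to a plain dot product.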
Sparse Embeddings

Model | Vocab Size | Max Length | Bundle
--- | --- | --- | ---
BAAI/bge-m3 | 250002 | 8192 | default
naver/splade-v3 | 30522 | 512 | default
naver/splade-cocondenser-selfdistil | 30522 | 512 | default
prithivida/Splade_PP_en_v2 | 30522 | 256 | default
rasyosef/splade-mini | 30522 | 128 | default
ibm-granite/granite-embedding-30m-sparse | 30522 | 512 | default
opensearch-project/opensearch-neural-sparse-encoding-v1 | | | default
opensearch-project/opensearch-neural-sparse-encoding-v2-distill | | | default
opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill | | | default
opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini | | | default
opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill | | | default
opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte | | | default
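Sparse encoders such as SPLADE emit one weight per vocabulary entry (the Vocab Size column), and relevance is the dot product taken over the terms that both vectors activate. A minimal sketch with hypothetical token ids and weights:

```python
def sparse_dot(query_terms, doc_terms):
    """Dot product over overlapping vocabulary entries.

    Each side is a {token_id: weight} mapping, the natural representation
    for SPLADE-style sparse vectors. Iterating the smaller mapping keeps
    the cost proportional to the sparser side.
    """
    smaller, larger = sorted((query_terms, doc_terms), key=len)
    return sum(w * larger.get(t, 0.0) for t, w in smaller.items())

q = {2054: 1.2, 3007: 0.7}            # hypothetical query term weights
d = {2054: 0.9, 1997: 0.4, 3007: 0.3}  # hypothetical document term weights
sparse_dot(q, d)  # 1.2*0.9 + 0.7*0.3 = 1.29
```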
Multi-Vector (ColBERT)

Model | Token Dim | Max Length | Bundle
--- | --- | --- | ---
jinaai/jina-colbert-v2 | 128 | 8192 | default
colbert-ir/colbertv2.0 | 128 | 512 | default
answerdotai/answerai-colbert-small-v1 | 96 | 512 | default
mixedbread-ai/mxbai-colbert-large-v1 | 128 | 512 | default
mixedbread-ai/mxbai-edge-colbert-v0-32m | 128 | 512 | default
lightonai/GTE-ModernColBERT-v1 | 128 | 8192 | default
lightonai/Reason-ModernColBERT | 128 | 8192 | default
nvidia/llama-nemoretriever-colembed-3b-v1 | 1024 | 512 | default
Vision Embeddings

Model | Dims | Resolution | Task | Bundle
--- | --- | --- | --- | ---
openai/clip-vit-base-patch32 | 512 | 224 | Image+text embedding | default
openai/clip-vit-large-patch14 | 768 | 224 | Image+text embedding | default
laion/CLIP-ViT-B-32-laion2B-s34B-b79K | 512 | 224 | Image+text embedding | default
laion/CLIP-ViT-H-14-laion2B-s32B-b79K | 1024 | 224 | Image+text embedding | default
google/siglip-so400m-patch14-224 | 1152 | 224 | Image+text embedding | default
google/siglip-so400m-patch14-384 | 1152 | 384 | Image+text embedding | default
vidore/colpali-v1.3-hf | 128 | 1024 | Document vision (ColBERT) | default
vidore/colqwen2.5-v0.2 | 128 | 1024 | Document vision (ColBERT) | default

Rerankers

Model | Max Length | Languages | Bundle
--- | --- | --- | ---
BAAI/bge-reranker-base | 512 | English | default
BAAI/bge-reranker-large | 512 | English | default
BAAI/bge-reranker-v2-m3 | 8192 | 100+ | default
jinaai/jina-reranker-v2-base-multilingual | 8192 | 100+ | default
mixedbread-ai/mxbai-rerank-base-v2 | 8192 | English | default
mixedbread-ai/mxbai-rerank-large-v2 | 8192 | English | default
Alibaba-NLP/gte-reranker-modernbert-base | 8192 | English | default
cross-encoder/ms-marco-MiniLM-L-6-v2 | 512 | English | default
cross-encoder/ms-marco-MiniLM-L-12-v2 | 512 | English | default

ColBERT models can also be used for reranking via MaxSim scoring. See the Multi-Vector section above.
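MaxSim sums, for each query token vector, its best dot product against any document token vector; the document whose tokens best cover the query tokens wins. A minimal pure-Python sketch of that scoring rule (illustrative only; a real deployment would use the model's per-token embeddings):

```python
def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction (MaxSim) score.

    For each query token vector, take the maximum dot product over all
    document token vectors, then sum those maxima across query tokens.
    """
    score = 0.0
    for q in query_vecs:
        best = max(sum(qi * di for qi, di in zip(q, d)) for d in doc_vecs)
        score += best
    return score

# Toy 2-d token vectors; real ColBERT token dims are listed in the table above.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.9, 0.1], [0.2, 0.8]]
maxsim_score(query, doc)  # 0.9 + 0.8, roughly 1.7
```

To rerank, compute this score between the query and each candidate document, then sort candidates by score descending.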


Named Entity Recognition (GLiNER)

Model | Languages | Notes | Bundle
--- | --- | --- | ---
urchade/gliner_small-v2.1 | English | Small | gliner
urchade/gliner_medium-v2.1 | English | Medium | gliner
urchade/gliner_large-v2.1 | English | Large | gliner
urchade/gliner_multi-v2.1 | Multilingual | Recommended | gliner
urchade/gliner_multi_pii-v1 | Multilingual | PII detection | gliner
EmergentMethods/gliner_large_news-v2.1 | English | News domain | gliner
Ihor/gliner-biomed-large-v1.0 | English | Biomedical | gliner
NeuML/gliner-bert-tiny | English | Tiny, fastest | gliner
numind/NuNER_Zero | English | Zero-shot | gliner
numind/NuNER_Zero-span | English | Span extraction | gliner
Relation Extraction

Model | Notes | Bundle
--- | --- | ---
jackboyla/glirel-large-v0 | Zero-shot relation extraction | gliner
Zero-Shot Classification

Model | Approach | Max Length | Bundle
--- | --- | --- | ---
knowledgator/gliclass-small-v1.0 | GLiClass | 512 | gliner
knowledgator/gliclass-base-v1.0 | GLiClass | 512 | gliner
MoritzLaurer/deberta-v3-base-zeroshot-v2.0 | NLI | 512 | default
MoritzLaurer/deberta-v3-large-zeroshot-v2.0 | NLI | 512 | default
Vision Extraction

Model | Tasks | Bundle
--- | --- | ---
microsoft/Florence-2-base | OCR, caption, detection | florence2
microsoft/Florence-2-large | OCR, caption, detection | florence2
microsoft/Florence-2-base-ft | OCR, caption, detection | florence2
mynkchaudhry/Florence-2-FT-DocVQA | Document QA | florence2
naver-clova-ix/donut-base-finetuned-docvqa | Document QA | florence2
naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt parsing | florence2
naver-clova-ix/donut-base-finetuned-rvlcdip | Document classification | florence2
Object Detection

Model | Notes | Bundle
--- | --- | ---
IDEA-Research/grounding-dino-tiny | Smaller, faster | default
IDEA-Research/grounding-dino-base | Higher quality | default
google/owlv2-base-patch16-ensemble | OWL-ViT based | default

Models require specific bundles due to dependency conflicts:

Bundle | Image Tag | Models
--- | --- | ---
default | cuda12-default | Most models (embeddings, rerankers, ColBERT, NLI classification)
gliner | cuda12-gliner | GLiNER, GLiREL, GLiClass, NuNER models
sglang | cuda12-sglang | LLM-based models (e5-mistral-7b, Nemotron, SFR, etc.)
florence2 | cuda12-florence2 | Florence-2, Donut vision models

See Bundles for details.
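Bundle selection happens at deploy time by choosing the matching image tag. A hypothetical invocation, assuming a GPU host and the default port; the registry path is a placeholder you must replace with your deployment's actual image name:

```shell
# Placeholder registry path; the cuda12-gliner tag follows the bundle table above.
docker run --gpus all -p 8080:8080 <your-registry>/sie:cuda12-gliner
```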


You can programmatically query which models are available on a running SIE instance:

```python
from sie_sdk import SIEClient

client = SIEClient("http://localhost:8080")

# List the models available on this instance
models = client.list_models()
for model in models:
    print(f"{model.name}: {model.dims} dims, loaded={model.loaded}")
```

SIE can serve any HuggingFace model that fits an existing adapter. See Adding Models.