SIE supports 80+ embedding models across dense, sparse, multi-vector, and multimodal categories. Model performance varies by task, so run `mise run eval <model> -t <task>` to benchmark candidates on your own data.
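For example (the task name `retrieval` below is illustrative, not a bundled task; substitute one of your configured eval tasks):

```bash
# Benchmark bge-m3 on a task of your choosing; "retrieval" is a placeholder
mise run eval BAAI/bge-m3 -t retrieval
```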
| Model | Dims | Max Length | Languages | Notes |
|---|---|---|---|---|
| BAAI/bge-m3 | 1024 | 8192 | 100+ | Also supports sparse, multi-vector |
| Alibaba-NLP/gte-Qwen2-1.5B-instruct | 1536 | 32768 | Multilingual | Long context, instruction-tuned |
| Alibaba-NLP/gte-Qwen2-7B-instruct | 3584 | 32000 | Multilingual | Largest, highest quality |
| Alibaba-NLP/gte-multilingual-base | 768 | 8192 | 50+ | Efficient multilingual |
| NovaSearch/stella_en_400M_v5 | 1024 | 512 | English | Balanced |
| NovaSearch/stella_en_1.5B_v5 | 1024 | 512 | English | High quality |
| Model | Dims | Max Length | Notes |
|---|---|---|---|
| intfloat/e5-small-v2 | 384 | 512 | Fast, small |
| intfloat/e5-base-v2 | 768 | 512 | Balanced |
| intfloat/e5-large-v2 | 1024 | 512 | High quality |
| intfloat/multilingual-e5-large | 1024 | 512 | Multilingual |
| intfloat/multilingual-e5-large-instruct | 1024 | 512 | Instruction-tuned |
| intfloat/e5-mistral-7b-instruct | 4096 | 4096 | LLM-based |
| Model | Dims | Max Length | Notes |
|---|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | 256 | Fast baseline |
| Model | Dims | Max Length | Notes |
|---|---|---|---|
| nvidia/NV-Embed-v2 | 4096 | 32768 | NVIDIA optimized |
| nvidia/llama-embed-nemotron-8b | 4096 | 8192 | LLM-based |
| Salesforce/SFR-Embedding-Mistral | 4096 | 4096 | Salesforce |
| Salesforce/SFR-Embedding-2_R | 4096 | 8192 | Latest version |
| GritLM/GritLM-7B | 4096 | 8192 | Generative + embedding |
| Linq-AI-Research/Linq-Embed-Mistral | 4096 | 32768 | Long context |
| google/embeddinggemma-300m | 768 | 2048 | Gemma-based |
| Model | Dims | Max Length | Notes |
|---|---|---|---|
| Qwen/Qwen3-Embedding-0.6B | 1024 | 32768 | Small, fast |
| Qwen/Qwen3-Embedding-4B | 2560 | 32768 | High quality |
| Model | Vocab Size | Max Length | Notes |
|---|---|---|---|
| BAAI/bge-m3 | 250002 | 8192 | Multi-output (also dense) |
| naver/splade-v3 | 30522 | 512 | High-quality sparse |
| naver/splade-cocondenser-selfdistil | 30522 | 512 | Balanced |
| prithivida/Splade_PP_en_v2 | 30522 | 256 | English |
| rasyosef/splade-mini | 30522 | 128 | Small |
| ibm-granite/granite-embedding-30m-sparse | 30522 | 512 | IBM |
| Model | Notes |
|---|---|
| opensearch-project/opensearch-neural-sparse-encoding-v1 | Original |
| opensearch-project/opensearch-neural-sparse-encoding-v2-distill | Distilled |
| opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill | Document-side |
| opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini | Small |
| opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill | V3 distilled |
| opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte | GTE-based |
| Model | Token Dim | Max Length | Notes |
|---|---|---|---|
| jinaai/jina-colbert-v2 | 128 | 8192 | Long context |
| answerdotai/answerai-colbert-small-v1 | 128 | 512 | Fast, small |
| colbert-ir/colbertv2.0 | 128 | 512 | Original ColBERT |
| mixedbread-ai/mxbai-colbert-large-v1 | 1024 | 512 | Large dimension |
| mixedbread-ai/mxbai-edge-colbert-v0-32m | 128 | 512 | Edge/mobile |
| lightonai/GTE-ModernColBERT-v1 | 128 | 8192 | Modern architecture |
| lightonai/Reason-ModernColBERT | 128 | 8192 | Reasoning-focused |
| nvidia/llama-nemoretriever-colembed-3b-v1 | 1024 | 512 | NVIDIA |
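Multi-vector models return one vector per input token rather than a single pooled embedding, which enables late-interaction scoring. A minimal sketch using the Python SDK (the exact shape of the returned result for multi-vector models is an assumption; inspect it in your SDK version):

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Encode with a ColBERT-style model; expect per-token vectors
# (e.g. one 128-dim vector per token for jina-colbert-v2).
# Assumption: verify the result's multi-vector field in your SDK version.
result = client.encode("jinaai/jina-colbert-v2", Item(text="Hello world"))
```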
| Model | Dims | Resolution | Notes |
|---|---|---|---|
| openai/clip-vit-base-patch32 | 512 | 224 | Fast baseline |
| openai/clip-vit-large-patch14 | 768 | 224 | Higher quality |
| laion/CLIP-ViT-B-32-laion2B-s34B-b79K | 512 | 224 | LAION trained |
| laion/CLIP-ViT-H-14-laion2B-s32B-b79K | 1024 | 224 | Large |
| Model | Dims | Resolution | Notes |
|---|---|---|---|
| google/siglip-so400m-patch14-224 | 1152 | 224 | Fast |
| google/siglip-so400m-patch14-384 | 1152 | 384 | Higher resolution |
| Model | Token Dim | Resolution | Notes |
|---|---|---|---|
| vidore/colpali-v1.3-hf | 128 | 1024 | Document pages |
| vidore/colqwen2.5-v0.2 | 128 | 1024 | Qwen-based |
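CLIP- and SigLIP-style models embed text and images into a shared space, so the text tower can be queried with the same `encode` call used for text models. A minimal sketch (how `Item` carries image input is not shown on this page and is left as an assumption to verify against the SDK types):

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Text tower: embed a caption-style query into the shared text/image space
query = client.encode("openai/clip-vit-base-patch32", Item(text="a photo of a cat"))

# Image input is supported by these models, but the Item field for images
# is an assumption to check in sie_sdk.types before use.
```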
Models are grouped into bundles based on dependency compatibility:
| Bundle | Models | Notes |
|---|---|---|
| default | Most models | Standard dependencies |
| legacy | Older transformers | Compatibility mode |
| gte-qwen2 | GTE-Qwen2 models | Qwen dependencies |
| sglang | LLM-based models | SGLang runtime |
| florence2 | Florence-2 | Vision dependencies |
Start with a specific bundle:

```bash
# CPU
docker run -p 8080:8080 ghcr.io/superlinked/sie:default

# GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default
```
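Other bundles should follow the same pattern, assuming bundle names map to image tags the way `default` does above (verify against the published tags):

```bash
# Assumption: the sglang bundle is published under a matching image tag
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:sglang
```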
```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# List available models and their status
models = client.list_models()
for model in models:
    print(f"{model.name}: {model.dims} dims, loaded={model.loaded}")

# Use any model from the catalog
result = client.encode("BAAI/bge-m3", Item(text="Hello world"))
```
```typescript
import { SIEClient } from "@sie/sdk";

const client = new SIEClient("http://localhost:8080");

// List available models and their status
const models = await client.listModels();
for (const model of models) {
  console.log(`${model.name}: ${model.dims?.dense} dims, loaded=${model.loaded}`);
}

// Use any model from the catalog
const result = await client.encode("BAAI/bge-m3", { text: "Hello world" });
```
See Adding Models for configuring new models.
Evals - benchmark models on your tasks