SIE

Overview

Dense embeddings are fixed-dimension float vectors that capture semantic meaning. They power similarity search, RAG pipelines, and recommendation systems.

from sie_sdk import SIEClient
from sie_sdk.types import Item
client = SIEClient("http://localhost:8080")
result = client.encode("BAAI/bge-m3", Item(text="Your text here"))
print(f"Dimensions: {len(result['dense'])}") # 1024

Use dense embeddings when:

  • You need semantic similarity (not exact keyword matching)
  • Your vector database supports dense vectors (most do)
  • Storage is not extremely constrained

Consider alternatives when:

  • You need exact keyword or lexical matching (sparse embeddings are a better fit)
  • Storage is extremely constrained (a 1024-dimension float32 vector costs 4 KB per item)

Pass a single Item to get a single result:

result = client.encode("BAAI/bge-m3", Item(text="Hello world"))
print(result["dense"][:5]) # First 5 dimensions
# [0.0234, -0.0891, 0.1234, ...]

Pass a list of items for efficient batch processing:

items = [
    Item(text="First document"),
    Item(text="Second document"),
    Item(text="Third document"),
]
results = client.encode("BAAI/bge-m3", items)
for i, result in enumerate(results):
    print(f"Doc {i}: {len(result['dense'])} dimensions")

The server batches requests automatically for GPU efficiency.

Track which result corresponds to which input:

items = [
    Item(id="doc-1", text="First document"),
    Item(id="doc-2", text="Second document"),
]
results = client.encode("BAAI/bge-m3", items)
for result in results:
    print(f"{result['id']}: {len(result['dense'])} dims")

Many models perform better when you distinguish queries from documents. Queries are short questions. Documents are the content you search over.

For asymmetric models, set is_query=True:

# Encode query (short, question-like)
query = client.encode(
    "BAAI/bge-m3",
    Item(text="What is machine learning?"),
    is_query=True,
)

# Encode documents (longer, content)
documents = client.encode(
    "BAAI/bge-m3",
    [Item(text="Machine learning is..."), Item(text="Deep learning uses...")],
)
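Once both sides are encoded, ranking reduces to cosine similarity between the query vector and each document vector. A minimal numpy sketch using the query and documents results above (the norm division is a no-op if the model already L2-normalizes its outputs):

import numpy as np

q = np.asarray(query["dense"])
D = np.asarray([d["dense"] for d in documents])

# Cosine similarity between the query and every document
scores = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q))
best = int(np.argmax(scores))
print(f"Best match: doc {best} (score {scores[best]:.3f})")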

Some models accept explicit instructions:

result = client.encode(
    "Alibaba-NLP/gte-Qwen2-1.5B-instruct",
    Item(text="What is Python?"),
    instruction="Represent this query for retrieving programming tutorials:",
)

By default, encode returns dense embeddings. Request multiple output types:

# Dense only (default)
result = client.encode("BAAI/bge-m3", Item(text="text"))
print(result["dense"]) # numpy array

# Multiple outputs
result = client.encode(
    "BAAI/bge-m3",
    Item(text="text"),
    output_types=["dense", "sparse", "multivector"],
)
print(result["dense"]) # 1024-dim float array
print(result["sparse"]) # {"indices": [...], "values": [...]}
print(result["multivector"]) # [num_tokens, 1024] array

Not all models support all output types. BGE-M3 supports all three. Most models support only dense.
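Sparse output supports lexical-style scoring directly: the similarity of two sparse vectors is the dot product over their shared indices. A minimal sketch, assuming the indices/values shape shown above and that sparse can be requested on its own:

r1 = client.encode("BAAI/bge-m3", Item(text="machine learning"), output_types=["sparse"])
r2 = client.encode("BAAI/bge-m3", Item(text="deep learning models"), output_types=["sparse"])

def sparse_dot(a, b):
    # Dot product over the shared token indices of two sparse results
    b_weights = dict(zip(b["indices"], b["values"]))
    return sum(v * b_weights.get(i, 0.0) for i, v in zip(a["indices"], a["values"]))

print(sparse_dot(r1["sparse"], r2["sparse"]))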

The EncodeResult is a TypedDict containing:

Field        Type                      Description
id           str | None                Item ID if provided
dense        NDArray[float32]          Dense embedding vector
sparse       SparseResult | None       Sparse indices and values
multivector  NDArray[float32] | None   Per-token embeddings
timing       TimingInfo                Request timing breakdown

result = client.encode("BAAI/bge-m3", Item(text="text"))
# Access fields (TypedDict syntax)
embedding = result["dense"] # numpy array
dimensions = len(result["dense"]) # e.g., 1024
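The sparse and multivector fields are typed as optional, so they are presumably None when not requested or not supported by the model; guard before use (a small sketch):

result = client.encode("BAAI/bge-m3", Item(text="text"), output_types=["dense", "sparse"])
if result["sparse"] is not None:
    print(f"Sparse terms: {len(result['sparse']['indices'])}")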

These models work well for general-purpose embedding. Run mise run eval --print for benchmark data, or see the full catalog.

Model                                    Dims   Max Length   Notes
BAAI/bge-m3                              1024   8192         Multilingual, also supports sparse and multivector
intfloat/e5-base-v2                      768    512          Balanced quality and speed
sentence-transformers/all-MiniLM-L6-v2   384    256          Fast, lightweight

Models perform differently on different tasks. Identify a benchmark task similar to your problem, or create custom eval tasks. See Evals.

The server defaults to msgpack for efficient numpy array transport. To use JSON, set the Accept header:

curl -X POST http://localhost:8080/v1/encode/BAAI/bge-m3 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"items": [{"text": "Your text here"}]}'