Chroma

The sie-chroma package (Python) and @sie/chroma package (TypeScript) provide embedding functions for ChromaDB. Use SIEEmbeddingFunction for dense embeddings in standard collections. Use SIESparseEmbeddingFunction for hybrid search on Chroma Cloud.

Install the package:
pip install sie-chroma

This installs sie-sdk and chromadb as dependencies.

Start an SIE server:
# Docker (recommended)
docker run -p 8080:8080 ghcr.io/superlinked/sie:latest
# Or with GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:latest

SIEEmbeddingFunction implements ChromaDB’s EmbeddingFunction protocol. Use it when creating or querying collections.

from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)
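ChromaDB's `EmbeddingFunction` protocol boils down to a callable that takes a list of documents and returns one vector per document. A minimal stand-in (hypothetical, for illustration only; not the real SIE client, which sends the texts to the server over HTTP) looks like:

```python
from typing import List


class FakeEmbeddingFunction:
    """Hypothetical illustration of the EmbeddingFunction shape:
    a callable mapping a list of documents to equal-length vectors."""

    def __init__(self, dim: int = 4):
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # Deterministic toy "embedding" derived from the hash of each
        # document; a real embedding function would call a model instead.
        return [
            [((hash(doc) >> (8 * i)) & 0xFF) / 255.0 for i in range(self.dim)]
            for doc in input
        ]


ef = FakeEmbeddingFunction(dim=4)
vectors = ef(["hello", "world"])  # one 4-dimensional vector per document
```

Because Chroma only requires this callable shape, any object like the above can be passed as `embedding_function=` when creating a collection.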
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `base_url` | `str` | `http://localhost:8080` | SIE server URL |
| `model` | `str` | `BAAI/bge-m3` | Model to use for embeddings |
| `gpu` | `str` | `None` | Target GPU type for routing |
| `options` | `dict` | `None` | Model-specific options |
| `timeout_s` | `float` | `180.0` | Request timeout in seconds |

Create a ChromaDB collection with SIE embeddings and perform similarity search:

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Initialize the embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create a Chroma client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="documents",
    embedding_function=embedding_function,
)

# Add documents
collection.add(
    documents=[
        "Machine learning is a subset of artificial intelligence.",
        "Neural networks are inspired by biological neurons.",
        "Deep learning uses multiple layers of neural networks.",
        "Python is popular for machine learning development.",
    ],
    ids=["doc1", "doc2", "doc3", "doc4"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)

for doc, distance in zip(results["documents"][0], results["distances"][0]):
    print(f"{distance:.4f}: {doc}")

To persist data across sessions, use a PersistentClient:

import chromadb
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")

# Use persistent storage
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

SIESparseEmbeddingFunction generates sparse embeddings for Chroma Cloud hybrid search. Use it with SparseVectorIndexConfig.

from sie_chroma import SIESparseEmbeddingFunction

sparse_ef = SIESparseEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

The sparse embedding function returns dict[int, float] mappings of token indices to weights. This format is compatible with Chroma Cloud’s hybrid search feature.
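Since each sparse embedding is a plain `dict[int, float]`, converting one into parallel index/value arrays (a common sparse-vector wire format) takes only a few lines. A hypothetical helper, independent of the SIE client:

```python
def to_indices_values(sparse: dict[int, float]) -> tuple[list[int], list[float]]:
    """Split a token-index -> weight mapping into sorted parallel arrays."""
    indices = sorted(sparse)
    values = [sparse[i] for i in indices]
    return indices, values


# Example: a toy sparse embedding over three token indices.
indices, values = to_indices_values({1012: 0.83, 7: 1.20, 4096: 0.05})
# indices == [7, 1012, 4096]; values are the matching weights
```

Sorting by token index keeps the arrays deterministic, which makes embeddings easy to compare and serialize.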