Self-hosted inference
for search & document processing
# Configure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
terraform apply
helm install sie ghcr.io/superlinked/charts/sie

# Use
curl $URL/v1/encode/bge-m3?lora=legal \
  -d '{"text": "indemnification clause"}'
# Configure
module "sie" {
  source = "superlinked/sie/gcp"
  region = "us-east1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
terraform apply
helm install sie ghcr.io/superlinked/charts/sie

# Use
curl $URL/v1/encode/bge-m3?lora=legal \
  -d '{"text": "indemnification clause"}'
# Configure
module "sie" {
  source = "superlinked/sie/local"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}

# Deploy
terraform apply
helm install sie ghcr.io/superlinked/charts/sie

# Use
curl $URL/v1/encode/bge-m3?lora=legal \
  -d '{"text": "indemnification clause"}'
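Once the cluster is up, any HTTP client can hit the encode endpoint. A minimal Python sketch that builds the same request as the curl example above (the base URL is an assumption; substitute your deployment's URL):

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Base URL of the deployed service (assumption; use your cluster's URL).
BASE_URL = "http://localhost:8080"

def encode_request(model: str, text: str, lora: Optional[str] = None):
    """Build URL and body for the encode endpoint shown in the curl example."""
    path = f"{BASE_URL}/v1/encode/{model}"
    if lora:
        path += "?" + urlencode({"lora": lora})
    return path, json.dumps({"text": text})

url, body = encode_request("bge-m3", "indemnification clause", lora="legal")
# url  → http://localhost:8080/v1/encode/bge-m3?lora=legal
# body → {"text": "indemnification clause"}
```

The model name and LoRA adapter are path and query parameters, so switching models or adapters is a one-line change in the caller.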
Works with your favorite tools
Browse integrations
Benefits of self-hosted inference
Pay for your own GPUs instead of per-token API pricing. Improve GPU utilization and stability compared with custom TEI/Infinity deployments.
Boost accuracy with the latest task-specific open-source models: embeddings, rerankers, and extraction, including multi-modal and multi-vector.
Data never leaves your AWS/GCP account. You pick the models and configurations. SOC 2 Type 2 certified. Apache 2.0 licensed.
Learn from our example apps
Browse examples
SIE: Superlinked Inference Engine
Run all your search & document processing inference in one centralized cluster, across teams and workloads.
Build your apps
> pip install sie-sdk
> npm install @sie/sdk
and 5+ framework integrations
Manage models & configurations via code
config.update()
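Managing the model roster as code means it lives in version control next to your application. A hypothetical sketch (the field names and validation are illustrative assumptions, not SIE's actual schema):

```python
# Hypothetical model roster; field names are illustrative, not SIE's schema.
config = {
    "models": [
        {"name": "bge-m3", "task": "embedding", "loras": ["legal"]},
        {"name": "bge-reranker-v2-m3", "task": "rerank", "loras": []},
    ]
}

def validate(cfg):
    """Sanity-check a config before pushing it to the cluster."""
    if not cfg.get("models"):
        raise ValueError("at least one model is required")
    for model in cfg["models"]:
        if model["task"] not in {"embedding", "rerank", "extraction"}:
            raise ValueError(f"unknown task: {model['task']}")
    return True
```

A `config.update()` call would then ship a validated roster like this to the running cluster.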
Deploy the cluster
> helm install sie ghcr.io/superlinked/charts/sie
Observe with cloud-native tools such as Grafana, or from the CLI:
> sie-admin top
Create the infrastructure
module "sie" {
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "a10-spot"]
}
Deploy
> terraform apply
Plan your self-deployment
How SIE fits in your stack
See where SIE sits in a typical retrieval pipeline alongside vector databases, orchestration frameworks, and your application layer.
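That pipeline shape can be sketched end to end in a few lines. Here `embed()` stands in for a call to SIE's encode endpoint and a plain list stands in for the vector database (everything below is a toy stand-in, not a real integration):

```python
import math

# Toy in-memory pipeline: embed() stands in for a call to SIE's encode
# endpoint, and `index` stands in for a real vector database.
def embed(text):
    # Deterministic toy embedding (character codes bucketed into 4 dims).
    vec = [0.0, 0.0, 0.0, 0.0]
    for i, ch in enumerate(text.lower()):
        vec[i % 4] += ord(ch) / 1000.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

docs = ["indemnification clause", "payment terms", "termination notice"]
index = [(doc, embed(doc)) for doc in docs]

def search(query, k=2):
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

In a real deployment, the application layer calls SIE for embeddings (and reranking), while the vector database handles storage and approximate nearest-neighbor search.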
Cost Comparison
Compare across models, GPU types, and cloud providers.
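The comparison comes down to simple arithmetic: monthly token volume times per-token price versus GPU-hours times hourly rate. A back-of-envelope sketch (all figures are hypothetical placeholders, not quotes from any provider):

```python
# Hypothetical placeholder numbers, not real provider pricing.
tokens_per_month = 50_000_000_000   # 50B tokens embedded per month
api_price_per_1m_tokens = 0.10      # $ per million tokens (placeholder)
gpu_hourly_rate = 1.50              # $ per GPU-hour (placeholder)
num_gpus = 1
hours_per_month = 730

api_cost = tokens_per_month / 1_000_000 * api_price_per_1m_tokens
gpu_cost = num_gpus * gpu_hourly_rate * hours_per_month

print(api_cost, gpu_cost)  # 5000.0 1095.0
```

The break-even point depends on your token volume, model size, achievable GPU utilization, and whether spot instances fit your availability requirements.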