CrewAI
The sie-crewai package provides CrewAI tools and embedders: SIERerankerTool for reranking, SIEExtractorTool for entity extraction, and SIESparseEmbedder for hybrid search.
Installation
    pip install sie-crewai

This installs sie-sdk and crewai as dependencies.
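To confirm the install worked, a minimal sketch like the following may help. It uses only the Python standard library and the distribution names from the pip command above; the helper name `installed_version` is illustrative and not part of sie-crewai.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from the installation note above.
    for name in ("sie-crewai", "sie-sdk", "crewai"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

If any of the three lines prints "not installed", the active Python environment is probably not the one pip installed into.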
Start the Server
    # Docker (recommended)
    docker run -p 8080:8080 ghcr.io/superlinked/sie:default

    # Or with GPU
    docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default

Embedders
SIE integrates with CrewAI through two embedding approaches:
- Dense embeddings - Use SIE’s OpenAI-compatible API with CrewAI’s built-in embedder config
- Sparse embeddings - Use SIESparseEmbedder for hybrid search workflows
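As a rough illustration of what "hybrid search" means here, the sketch below combines a dense similarity with a sparse one into a single score. It assumes sparse vectors are dicts with 'indices' and 'values' keys holding parallel lists (the key names match what this page's sparse example prints; the list layout is an assumption). The function names and the `alpha` weight are hypothetical and not part of sie-crewai; in practice a vector database usually performs this fusion.

```python
import math


def sparse_dot(a, b):
    """Dot product of two sparse vectors shaped like {'indices': [...], 'values': [...]}."""
    weights = dict(zip(a["indices"], a["values"]))
    return sum(weights.get(i, 0.0) * v for i, v in zip(b["indices"], b["values"]))


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_score(dense_q, dense_d, sparse_q, sparse_d, alpha=0.5):
    """Weighted sum of dense and sparse similarity; alpha is an illustrative weight."""
    return alpha * cosine(dense_q, dense_d) + (1 - alpha) * sparse_dot(sparse_q, sparse_d)
```

The two scores are on different scales, so a real pipeline would typically normalize them or use a rank-based fusion before combining.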
Dense Embeddings
Configure CrewAI to use SIE’s OpenAI-compatible endpoint:

    from crewai import Crew

    crew = Crew(
        agents=[...],
        tasks=[...],
        embedder={
            "provider": "openai",
            "config": {
                "api_base": "http://localhost:8080/v1",
                "model": "BAAI/bge-m3",
            },
        },
    )

Sparse Embeddings
Use SIESparseEmbedder for sparse vectors in hybrid search:

    from sie_crewai import SIESparseEmbedder

    sparse_embedder = SIESparseEmbedder(
        base_url="http://localhost:8080",
        model="BAAI/bge-m3",
    )

    # Embed documents
    sparse_vectors = sparse_embedder.embed_documents([
        "Machine learning uses algorithms to learn from data.",
        "The weather is sunny today.",
    ])
    print(sparse_vectors[0].keys())  # dict_keys(['indices', 'values'])

    # Embed a query (uses is_query=True for asymmetric models)
    query_vector = sparse_embedder.embed_query("What is machine learning?")

Full Example
Complete example using SIE embeddings with a CrewAI agent for hybrid search:

    from crewai import Agent, Crew, Task
    from sie_crewai import SIESparseEmbedder

    # 1. Configure dense embeddings via OpenAI-compatible API
    embedder_config = {
        "provider": "openai",
        "config": {
            "api_base": "http://localhost:8080/v1",
            "model": "BAAI/bge-m3",
        },
    }

    # 2. Set up sparse embedder for hybrid search
    sparse_embedder = SIESparseEmbedder(
        base_url="http://localhost:8080",
        model="BAAI/bge-m3",
    )

    # 3. Prepare your corpus with both dense and sparse embeddings
    corpus = [
        "Machine learning is a branch of artificial intelligence.",
        "Neural networks are inspired by biological neurons.",
        "Deep learning uses multiple layers of neural networks.",
    ]

    # Get sparse embeddings for your vector database
    sparse_vectors = sparse_embedder.embed_documents(corpus)
    # Store sparse_vectors in your vector DB (Qdrant, Weaviate, etc.)

    # 4. Create a research agent
    researcher = Agent(
        role="Research Analyst",
        goal="Find and analyze information from the knowledge base",
        backstory="Expert at finding relevant information using semantic search.",
        verbose=True,
    )

    # 5. Define the research task
    research_task = Task(
        description="Search the knowledge base for information about deep learning.",
        expected_output="A summary of findings about deep learning.",
        agent=researcher,
    )

    # 6. Create and run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        embedder=embedder_config,
        verbose=True,
    )

    result = crew.kickoff()
    print(result)

Reranker Tool
SIERerankerTool is a CrewAI BaseTool that reranks documents by relevance to a query. Agents can use it to improve search quality.

    from crewai import Agent, Crew, Task
    from sie_crewai import SIERerankerTool

    reranker = SIERerankerTool(
        base_url="http://localhost:8080",
        model="jinaai/jina-reranker-v2-base-multilingual",
    )

    researcher = Agent(
        role="Research Analyst",
        goal="Find the most relevant information",
        tools=[reranker],
    )

    task = Task(
        description="Rerank these documents for the query 'What is deep learning?'",
        expected_output="The most relevant documents.",
        agent=researcher,
    )

    crew = Crew(agents=[researcher], tasks=[task])
    result = crew.kickoff()

Extractor Tool
SIEExtractorTool is a CrewAI BaseTool that extracts named entities from text using GLiNER models.

    from crewai import Agent, Crew, Task
    from sie_crewai import SIEExtractorTool

    extractor = SIEExtractorTool(
        base_url="http://localhost:8080",
        model="urchade/gliner_multi-v2.1",
        labels=["person", "organization", "location"],
    )

    analyst = Agent(
        role="Data Analyst",
        goal="Extract key entities from documents",
        tools=[extractor],
    )

    task = Task(
        description=(
            "Extract all people, organizations, and locations from: "
            "'Tim Cook announced new products at Apple Park in Cupertino.'"
        ),
        expected_output="A list of extracted entities.",
        agent=analyst,
    )

    crew = Crew(agents=[analyst], tasks=[task])
    result = crew.kickoff()

Configuration Options
SIESparseEmbedder
| Parameter | Type | Default | Description |
|---|---|---|---|
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | BAAI/bge-m3 | Model to use for sparse embeddings |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
SIERerankerTool
| Parameter | Type | Default | Description |
|---|---|---|---|
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | jinaai/jina-reranker-v2-base-multilingual | Reranker model |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
SIEExtractorTool
| Parameter | Type | Default | Description |
|---|---|---|---|
| base_url | str | http://localhost:8080 | SIE server URL |
| model | str | urchade/gliner_multi-v2.1 | Extraction model |
| labels | list[str] | ["person", "organization", "location"] | Default entity labels |
| gpu | str | None | Target GPU type for routing |
| options | dict | None | Model-specific options |
| timeout_s | float | 180.0 | Request timeout in seconds |
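Since all three classes accept base_url, gpu, and timeout_s, one way to keep them consistent is to build those shared arguments once. The sketch below is only a suggestion: the helper `sie_common_kwargs` and the environment variable names SIE_BASE_URL, SIE_TIMEOUT_S, and SIE_GPU are hypothetical and are not read by sie-crewai itself. The defaults mirror the tables above.

```python
import os


def sie_common_kwargs(env=os.environ):
    """Build keyword arguments shared by the embedder and both tools.

    The environment variable names are illustrative assumptions.
    """
    kwargs = {
        "base_url": env.get("SIE_BASE_URL", "http://localhost:8080"),
        "timeout_s": float(env.get("SIE_TIMEOUT_S", "180.0")),
    }
    gpu = env.get("SIE_GPU")
    if gpu:
        # Only pass gpu when set, so the documented default (None) applies otherwise.
        kwargs["gpu"] = gpu
    return kwargs


if __name__ == "__main__":
    common = sie_common_kwargs()
    print(common)
    # With sie-crewai installed, the shared arguments could then be reused:
    # from sie_crewai import SIEExtractorTool, SIERerankerTool, SIESparseEmbedder
    # sparse_embedder = SIESparseEmbedder(model="BAAI/bge-m3", **common)
    # reranker = SIERerankerTool(**common)
    # extractor = SIEExtractorTool(labels=["person", "organization"], **common)
```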
What’s Next
- Encode Text - dense and sparse embedding details
- Score / Rerank - reranking details
- Extract Entities - extraction details
- Model Catalog - all supported models
- Troubleshooting - common errors and solutions