
vidore/colpali-v1.3-hf

> [!IMPORTANT]
> This version of ColPali should be loaded with the `transformers 🤗` release, not with `colpali-engine`.
> It was converted using the `convert_colpali_weights_to_hf.py` script from the `vidore/colpali-v1.3-merged` checkpoint.

- **Architecture:** PaliGemma
- **Parameters:** 3.0B
- **Tasks:** Encode
- **Outputs:** Multi-Vec
- **Dimensions:** Multi-Vec: 128
- **Max Sequence Length:** 2,048 tokens
- **License:** gemma
- **Languages:** en
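Because the model emits multi-vector embeddings (one 128-dim vector per token or image patch), relevance between a query and a page is scored with late interaction (MaxSim): each query vector keeps its best cosine match among the page's vectors, and the maxima are summed. A self-contained sketch with random stand-in embeddings (the token/patch counts are made up for illustration):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between two multi-vector embeddings.

    query_vecs: (n_query_tokens, 128), doc_vecs: (n_doc_tokens, 128),
    both assumed L2-normalized so dot products are cosine similarities.
    """
    sim = query_vecs @ doc_vecs.T          # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())    # best doc match per query token, summed

rng = np.random.default_rng(0)

def random_unit_vecs(n: int, dim: int = 128) -> np.ndarray:
    v = rng.normal(size=(n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

query = random_unit_vecs(12)     # stand-in for a 12-token query embedding
page_a = random_unit_vecs(700)   # stand-in for one page's patch embeddings
page_b = random_unit_vecs(700)

ranked = sorted(
    [("page_a", maxsim_score(query, page_a)),
     ("page_b", maxsim_score(query, page_b))],
    key=lambda kv: kv[1],
    reverse=True,
)
```

Ranking a corpus is then just computing this score against every page's stored multi-vector embedding and sorting, as above.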

Benchmarks

**Vidore3ComputerScienceRetrieval** (technology, retrieval, en)

Visual document retrieval on computer science papers and slides

Quality:
- NDCG@10: 0.7119
- MAP@10: 0.5767
- MRR@10: 0.8571

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 613.4 ms
- Query TPS: 170
- Query p50: 444.4 ms

**Vidore3FinanceEnRetrieval** (finance, retrieval, en)

Visual document retrieval on financial reports

Quality:
- NDCG@10: 0.4660
- MAP@10: 0.3570
- MRR@10: 0.6032

Performance (L4, b1, c4):
- Corpus TPS: 5
- Corpus p50: 624.9 ms
- Query TPS: 185
- Query p50: 434.0 ms

**english** (general, retrieval, en)

Visual document retrieval on HR-related documents

Quality:
- NDCG@10: 0.5481
- MAP@10: 0.4077
- MRR@10: 0.6708

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 629.8 ms
- Query TPS: 184
- Query p50: 447.9 ms

**Vidore3PharmaceuticalsRetrieval** (medical, retrieval, en)

Visual document retrieval on pharmaceutical documents

Quality:
- NDCG@10: 0.5786
- MAP@10: 0.4646
- MRR@10: 0.6847

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 591.7 ms
- Query TPS: 168
- Query p50: 403.3 ms
