
vidore/colpali-v1.3-hf

> [!IMPORTANT]
> This version of ColPali should be loaded with the `transformers 🤗` release, not with `colpali-engine`.
> It was converted using the `convert_colpali_weights_to_hf.py` script from the `vidore/colpali-v1.3-merged` checkpoint.

- **Architecture:** PaliGemma
- **Parameters:** 3.0B
- **Tasks:** Encode
- **Outputs:** Multi-Vec
- **Dimensions:** Multi-Vec: 128
- **Max Sequence Length:** 2,048 tokens
- **License:** gemma
- **Languages:** en
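Because the model emits multi-vector embeddings (one 128-dim vector per token or image patch), relevance between a query and a page is scored with late interaction (MaxSim): each query vector keeps its best cosine match among the page's vectors, and the maxima are summed. A self-contained sketch with random stand-in embeddings (the token/patch counts are made up for illustration):

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between two multi-vector embeddings.

    query_vecs: (n_query_tokens, 128), doc_vecs: (n_doc_tokens, 128),
    both assumed L2-normalized so dot products are cosine similarities.
    """
    sim = query_vecs @ doc_vecs.T          # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())    # best doc match per query token, summed

rng = np.random.default_rng(0)

def random_unit_vecs(n: int, dim: int = 128) -> np.ndarray:
    v = rng.normal(size=(n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

query = random_unit_vecs(12)     # stand-in for a 12-token query embedding
page_a = random_unit_vecs(700)   # stand-in for one page's patch embeddings
page_b = random_unit_vecs(700)

ranked = sorted(
    [("page_a", maxsim_score(query, page_a)),
     ("page_b", maxsim_score(query, page_b))],
    key=lambda kv: kv[1],
    reverse=True,
)
```

Ranking a corpus is then just computing this score against every page's stored multi-vector embedding and sorting, as above.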

Benchmarks

**Vidore3ComputerScienceRetrieval** (technology, retrieval, en)

Visual document retrieval on computer science papers and slides

Quality:
- NDCG@10: 0.7119
- MAP@10: 0.5767
- MRR@10: 0.8571

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 613.4 ms
- Query TPS: 170
- Query p50: 444.4 ms

**Vidore3FinanceEnRetrieval** (finance, retrieval, en)

Visual document retrieval on financial reports

Quality:
- NDCG@10: 0.4660
- MAP@10: 0.3570
- MRR@10: 0.6032

Performance (L4, b1, c4):
- Corpus TPS: 5
- Corpus p50: 624.9 ms
- Query TPS: 185
- Query p50: 434.0 ms

**english** (general, retrieval, en)

Visual document retrieval on HR-related documents

Quality:
- NDCG@10: 0.5481
- MAP@10: 0.4077
- MRR@10: 0.6708

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 629.8 ms
- Query TPS: 184
- Query p50: 447.9 ms

**Vidore3PharmaceuticalsRetrieval** (medical, retrieval, en)

Visual document retrieval on pharmaceutical documents

Quality:
- NDCG@10: 0.5786
- MAP@10: 0.4646
- MRR@10: 0.6847

Performance (L4, b1, c4):
- Corpus TPS: 6
- Corpus p50: 591.7 ms
- Query TPS: 168
- Query p50: 403.3 ms
