lightonai/GTE-ModernColBERT-v1

This is a PyLate model trained on the ms-marco-en-bge-gemma dataset. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors and can be used to score semantic textual similarity with the MaxSim operator.
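
As a rough illustration of how MaxSim scoring works on multi-vector outputs, here is a minimal pure-Python sketch. It assumes the usual late-interaction definition: for each query token vector, take the maximum dot product over all document token vectors, then sum those maxima. The function names (`dot`, `maxsim`) and the toy 2-dimensional vectors are illustrative assumptions, not PyLate's API; the model itself emits 128-dimensional vectors.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Sum, over query token vectors, of the best dot product against any document token vector."""
    if not doc_vecs:
        raise ValueError("document must contain at least one token vector")
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example with 2-dimensional vectors (hypothetical values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has an exact match for each query token
doc_b = [[0.5, 0.5]]                          # only a partial match for each query token

# Rank documents by descending MaxSim score.
scores = {"doc_a": maxsim(query, doc_a), "doc_b": maxsim(query, doc_b)}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Because every query token picks its own best-matching document token, a document scores well only if it covers each part of the query, which is what distinguishes this scheme from comparing a single pooled vector per text.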

Architecture: ModernBERT
Parameters: 305M
Tasks: Encode
Outputs: Multi-Vec
Dimensions: 128 (Multi-Vec)
Max Sequence Length: 8,192 tokens
License: apache-2.0

Benchmarks

CQADupstackPhysicsRetrieval

scientific retrieval en

Duplicate question retrieval from StackExchange Physics

Corpus: 38,314 Queries: 1,039
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 1.9K
Corpus p50 509.4ms
Query TPS 131
Query p50 573.4ms
Performance L4 b1 c16
Corpus TPS 1.9K
Corpus p50 509.4ms
Query TPS 131
Query p50 573.4ms
default
Quality
ndcg at 10 0.3886
map at 10 0.3410
mrr at 10 0.3904
Performance L4 b1 c16
Corpus TPS 21.7K
Corpus p50 88.1ms
Query TPS 2.5K
Query p50 68.2ms

CosQA

technology retrieval en

Code search with natural language queries

Corpus: 6,267 Queries: 500
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 890
Corpus p50 454.2ms
Query TPS 75
Query p50 566.6ms
Performance L4 b1 c16
Corpus TPS 890
Corpus p50 454.2ms
Query TPS 75
Query p50 566.6ms
default
Quality
ndcg at 10 0.3126
map at 10 0.2347
mrr at 10 0.2366
Performance L4 b1 c16
Corpus TPS 7.4K
Corpus p50 84.4ms
Query TPS 462
Query p50 76.3ms

FiQA2018

finance retrieval en

Financial opinion mining and question answering

Corpus: 57,599 Queries: 648
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 2.6K
Corpus p50 469.6ms
Query TPS 303
Query p50 278.2ms
Performance L4 b1 c16
Corpus TPS 2.6K
Corpus p50 469.6ms
Query TPS 303
Query p50 278.2ms
default
Quality
ndcg at 10 0.3838
map at 10 0.3133
mrr at 10 0.4648
Performance L4 b1 c16
Corpus TPS 18.9K
Corpus p50 106.6ms
Query TPS 2.4K
Query p50 71.9ms

LegalBenchConsumerContractsQA

legal retrieval en

Question answering on consumer contracts

Corpus: 153 Queries: 396
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 6.2K
Corpus p50 532.8ms
Query TPS 278
Query p50 327.3ms
Performance L4 b1 c16
Corpus TPS 6.2K
Corpus p50 532.8ms
Query TPS 278
Query p50 327.3ms
default
Quality
ndcg at 10 0.7773
map at 10 0.7300
mrr at 10 0.7321
Performance L4 b1 c16
Corpus TPS 42.9K
Corpus p50 192.4ms
Query TPS 3.6K
Query p50 70.2ms

NFCorpus

medical retrieval en

Biomedical literature search from NutritionFacts.org

Corpus: 3,593 Queries: 323
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 4.4K
Corpus p50 463.3ms
Query TPS 111
Query p50 299.7ms
Performance L4 b1 c16
Corpus TPS 4.4K
Corpus p50 463.3ms
Query TPS 111
Query p50 299.7ms
default
Quality
ndcg at 10 0.3616
map at 10 0.1390
mrr at 10 0.5824
Performance L4 b1 c16
Corpus TPS 35.9K
Corpus p50 101.3ms
Query TPS 1.7K
Query p50 45.7ms

NanoFiQA2018Retrieval

finance retrieval en

Smaller subset of the FiQA financial QA dataset

Quality
ndcg at 10 0.5229
map at 10 0.4304
mrr at 10 0.5544

SCIDOCS

scientific retrieval en

Citation prediction, document classification, and recommendation for scientific papers

Corpus: 25,656 Queries: 1,000
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 4.4K
Corpus p50 257.6ms
Query TPS 184
Query p50 327.2ms
Performance L4 b1 c16
Corpus TPS 4.4K
Corpus p50 257.6ms
Query TPS 184
Query p50 327.2ms
default
Quality
ndcg at 10 0.1607
map at 10 0.0934
mrr at 10 0.2874
Performance L4 b1 c16
Corpus TPS 30.1K
Corpus p50 96.3ms
Query TPS 2.1K
Query p50 68.6ms

SciFact

scientific retrieval en

Scientific claim verification using research literature

Corpus: 5,183 Queries: 300
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 9.2K
Corpus p50 241.6ms
Query TPS 396
Query p50 265.9ms
Performance L4 b1 c16
Corpus TPS 9.2K
Corpus p50 241.6ms
Query TPS 396
Query p50 265.9ms
default
Quality
ndcg at 10 0.7326
map at 10 0.6940
mrr at 10 0.7090
Performance L4 b1 c16
Corpus TPS 31.9K
Corpus p50 118.1ms
Query TPS 3.4K
Query p50 75.1ms

StackOverflowQA

technology retrieval en

Programming question answering from Stack Overflow

Corpus: 19,931 Queries: 1,994
default_candidates-k-50_candidates-model-Alibaba-NLP__gte-multilingual-base
Performance L4-SPOT b1 c16
Corpus TPS 3.8K
Corpus p50 458.1ms
Query TPS 9.2K
Query p50 222.9ms
Performance L4 b1 c16
Corpus TPS 3.8K
Corpus p50 458.1ms
Query TPS 9.2K
Query p50 222.9ms
default
Quality
ndcg at 10 0.5067
map at 10 0.4750
mrr at 10 0.4750
Performance L4 b1 c16
Corpus TPS 26.0K
Corpus p50 127.7ms
Query TPS 52.9K
Query p50 91.7ms
