cross-encoder/ms-marco-MiniLM-L-6-v2

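This cross-encoder was trained on the MS MARCO Passage Ranking task. Given a query and a passage, it outputs a relevance score that can be used to rerank candidate passages.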

Overview

Architecture: BERT
Parameters: 23M
Tasks: Score
Outputs: Score
Max Sequence Length: 512 tokens
License: apache-2.0
Languages: en
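
The overview above describes a model that takes a (query, passage) pair and returns a single score. A minimal sketch of how such scores are typically used for reranking follows. The `rerank` helper and the `toy_overlap_scorer` are hypothetical names introduced for illustration; the toy scorer is a stand-in so the example runs without downloading anything, and it is not the model. If the sentence-transformers package is installed, the `predict` method of `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2", max_length=512)` should be usable as the scorer instead.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A scorer takes (query, passage) pairs and returns one relevance score per pair.
Scorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def toy_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer (NOT the model): fraction of query words present in the passage."""
    scores = []
    for query, passage in pairs:
        q_words = set(query.lower().split())
        p_words = set(passage.lower().split())
        scores.append(len(q_words & p_words) / len(q_words) if q_words else 0.0)
    return scores


def rerank(
    query: str,
    passages: Sequence[str],
    score_pairs: Scorer,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted by score, best first."""
    pairs = [(query, passage) for passage in passages]
    scores = score_pairs(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k] if top_k is not None else ranked


if __name__ == "__main__":
    candidates = [
        "how to bake bread at home",
        "install python on ubuntu",
        "ubuntu python install guide for beginners",
    ]
    for passage, score in rerank("install python ubuntu", candidates, toy_overlap_scorer):
        print(f"{score:.2f}  {passage}")
```

Keeping the scorer as a parameter separates the sorting logic from the model call, so the same helper works with a local model or a hosted scoring endpoint.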

Benchmarks

AskUbuntuDupQuestions

technology reranking en

Duplicate question detection from AskUbuntu

Corpus: 6,743 Queries: 360
Quality
nDCG@10: 0.6027
MAP@10: 0.4439
MRR@10: 0.6776
Performance L4 b1 c16
Query 948 tok/s
Query p50 362.8ms
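
For readers unfamiliar with the quality metrics reported on this page, here is a minimal sketch of how nDCG, MAP, and MRR at a cutoff of 10 can be computed for a single query from the relevance labels of its ranked results. The exact definitions used by the benchmark harness are not stated on this page, so the choices below (linear gain for DCG, average precision normalized by the number of relevant results found in the top k, and an ideal ranking built from the same label list) are assumptions. A benchmark score would then be the mean of the per-query values.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Reciprocal rank of the first relevant result within the top k (0.0 if none)."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Average precision over the top k.

    Assumption: normalized by the number of relevant results found in the top k.
    Some harnesses normalize by the total number of relevant documents instead.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def dcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevance[:k], start=1))


def ndcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """DCG divided by the DCG of the ideal ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels of one query's results, in ranked order (1 = relevant).
    labels = [0, 1, 0, 1]
    print(mrr_at_k(labels), ap_at_k(labels), round(ndcg_at_k(labels), 4))
```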

CMedQAv1Reranking

medical reranking zh

Chinese medical question answering reranking (v1)

Corpus: 100,000 Queries: 2,000
Quality
MAP@10: 0.0835
MRR@10: 0.1371

CMedQAv2Reranking

medical reranking zh

Chinese medical question answering reranking (v2)

Corpus: 108,000 Queries: 4,000
Quality
MAP@10: 0.0926
MRR@10: 0.1425

CQADupstackPhysicsRetrieval (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 44.3K tok/s
Query p50 44.6ms
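
The performance figures on this page pair a throughput number (tok/s) with a median latency (p50). The page does not define how they are measured, so the sketch below rests on assumptions: p50 is taken as the median per-request latency, and throughput as total tokens processed divided by wall-clock time. The `summarize` function and its parameters are hypothetical names used only for illustration.

```python
import statistics
from typing import Sequence, Tuple


def summarize(
    latencies_ms: Sequence[float],
    tokens_per_request: Sequence[int],
    wall_seconds: float,
) -> Tuple[float, float]:
    """Return (p50 latency in ms, throughput in tokens per second).

    Assumptions: p50 is the median of per-request latencies; throughput counts
    all tokens processed during the run, divided by its wall-clock duration.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    p50_ms = statistics.median(latencies_ms)
    tokens_per_second = sum(tokens_per_request) / wall_seconds
    return p50_ms, tokens_per_second


if __name__ == "__main__":
    # Five made-up requests of 100 tokens each, completed within 0.5 s overall.
    p50, tps = summarize([40.0, 50.0, 45.0, 44.0, 100.0], [100] * 5, 0.5)
    print(f"p50={p50}ms throughput={tps} tok/s")
```

Because requests may run concurrently, throughput cannot in general be derived from the median latency alone, which is why the two numbers are reported separately.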

CosQA (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 20.5K tok/s
Query p50 43.6ms

FiQA2018 (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 51.1K tok/s
Query p50 43.4ms

LegalBenchConsumerContractsQA (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 91.7K tok/s
Query p50 45.6ms

MMarcoReranking

general reranking zh

Multilingual MARCO passage reranking (Chinese)

Quality
MAP@10: 0.0543
MRR@10: 0.0544

NFCorpus (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 70.8K tok/s
Query p50 45.9ms

NanoFiQA2018Retrieval

finance retrieval en

Smaller subset of the FiQA financial QA dataset

Performance L4 b1 c16
Query 7.5K tok/s
Query p50 388.1ms

SCIDOCS (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 53.7K tok/s
Query p50 42.5ms

SciFact (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 67.4K tok/s
Query p50 42.1ms

StackOverflowQA (candidates_model=Alibaba-NLP)

general retrieval en

Performance L4 b1 c16
Query 98.6K tok/s
Query p50 47.2ms

T2Reranking

general reranking zh

Chinese passage ranking benchmark

Quality
MAP@10: 0.4714
MRR@10: 0.7102
