
opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini

Select a model by weighing search relevance, model inference cost, and retrieval efficiency (FLOPS). We benchmark the models' zero-shot performance on a subset of the BEIR benchmark: TrecCovid, NFCorpus, NQ, HotpotQA, FiQA, ArguAna, Touche, DBPedia, SCIDOCS, FEVER, Climate FEVER, SciFact, and Quora.

Overview

Architecture: BERT
Parameters: 23M
Tasks: Encode
Outputs: Sparse
Dimensions: 30,522 (sparse)
Max Sequence Length: 512 tokens
License: apache-2.0
Languages: en
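
To make the "Sparse: 30,522" output concrete, here is a minimal sketch of how a sparse vector of that size can be stored and scored. It assumes each dimension corresponds to a token id and that relevance is a plain dot product between a query vector and a document vector; the type alias, function names, token ids, and weights below are hypothetical and do not come from the model itself.

```python
from typing import Dict

VOCAB_SIZE = 30_522  # sparse output dimension listed in the overview

# Assumed representation: token id -> weight; ids that are absent weigh zero.
SparseVector = Dict[int, float]


def is_valid(vec: SparseVector) -> bool:
    """Check that every token id falls inside the 30,522-dimensional space."""
    return all(0 <= token_id < VOCAB_SIZE for token_id in vec)


def sparse_dot(a: SparseVector, b: SparseVector) -> float:
    """Dot product over the token ids that both vectors share."""
    # Iterate over the smaller vector so the cost follows the shorter side.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[token_id] for token_id, weight in a.items() if token_id in b)


# Hypothetical weights, for illustration only.
doc_vec = {2054: 1.2, 3231: 0.8, 7592: 0.5}
query_vec = {2054: 1.0, 7592: 2.0}

score = sparse_dot(query_vec, doc_vec)  # 1.0*1.2 + 2.0*0.5
```

Only the non-zero entries are stored, so a document that activates a few hundred tokens costs a few hundred entries rather than 30,522 floats.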

Benchmarks

CQADupstackPhysicsRetrieval

scientific retrieval en

Duplicate question retrieval from StackExchange Physics

Corpus: 38,314 Queries: 1,039
Performance L4 b1 c16
Corpus 36.4K tok/s
Corpus p50 49.4ms
Query 4.6K tok/s
Query p50 37.1ms
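
The performance figures report a throughput in tokens per second and a p50 (median) latency. The page does not say how they were measured, so the following is only a sketch under stated assumptions: throughput is taken as total tokens divided by wall-clock time for the whole run, and p50 is the median of per-request latencies. The function name and the sample numbers are hypothetical.

```python
import statistics
from typing import Dict, List


def summarize_run(latencies_ms: List[float],
                  tokens_per_request: List[int],
                  wall_clock_s: float) -> Dict[str, float]:
    """Summarize one benchmark run as p50 latency and token throughput."""
    if len(latencies_ms) != len(tokens_per_request):
        raise ValueError("need one latency and one token count per request")
    if wall_clock_s <= 0:
        raise ValueError("wall_clock_s must be positive")
    return {
        # Median of per-request latencies (averages the two middle values
        # when the number of requests is even).
        "p50_ms": statistics.median(latencies_ms),
        # With concurrent requests, wall-clock time (not the sum of
        # latencies) is the right denominator for throughput.
        "tok_per_s": sum(tokens_per_request) / wall_clock_s,
    }


# Made-up measurements, for illustration only.
summary = summarize_run(
    latencies_ms=[40.0, 50.0, 45.0, 60.0],
    tokens_per_request=[100, 200, 150, 50],
    wall_clock_s=0.1,
)
```

Because throughput depends on how many tokens each request carries, benchmarks with long inputs can show much higher tok/s at a similar p50.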

CosQA

technology retrieval en

Code search with natural language queries

Corpus: 6,267 Queries: 500
Performance L4 b1 c16
Corpus 15.0K tok/s
Corpus p50 48.7ms
Query 2.3K tok/s
Query p50 40.4ms

FiQA2018

finance retrieval en

Financial opinion mining and question answering

Corpus: 57,599 Queries: 648
Performance L4 b1 c16
Corpus 39.9K tok/s
Corpus p50 54.3ms
Query 4.7K tok/s
Query p50 38.5ms

LegalBenchConsumerContractsQA

legal retrieval en

Question answering on consumer contracts

Corpus: 153 Queries: 396
Performance L4 b1 c16
Corpus 110.6K tok/s
Corpus p50 61.1ms
Query 7.0K tok/s
Query p50 36.7ms

NFCorpus

medical retrieval en

Biomedical literature search from NutritionFacts.org

Corpus: 3,593 Queries: 323
Quality
nDCG@10 0.3267
MAP@10 0.1263
MRR@10 0.5384
Performance L4 b1 c16
Corpus 61.6K tok/s
Corpus p50 67.8ms
Query 1.8K tok/s
Query p50 42.1ms
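
The quality figures for NFCorpus are ranking metrics cut off at rank 10. As a rough illustration of what each one measures, the sketch below computes them for a single query; a benchmark score would be the mean over all queries. The function names and the toy data are hypothetical, and the exact conventions of the evaluation tool behind the reported numbers (for example, the denominator used for average precision) are not stated on the page and may differ.

```python
import math
from typing import Dict, List


def dcg_at_k(gains: List[float], k: int) -> float:
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids: List[str], relevance: Dict[str, int], k: int = 10) -> float:
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [float(relevance.get(doc_id, 0)) for doc_id in ranked_ids]
    ideal = sorted((float(r) for r in relevance.values()), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def mrr_at_k(ranked_ids: List[str], relevance: Dict[str, int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant document in the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_ids: List[str], relevance: Dict[str, int], k: int = 10) -> float:
    """Average precision over the top k.

    Assumption: the sum is divided by the total number of relevant
    documents for the query; other tools divide by min(relevant, k).
    """
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            hits += 1
            total += hits / rank
    num_relevant = sum(1 for r in relevance.values() if r > 0)
    return total / num_relevant if num_relevant else 0.0


# Toy query: two relevant documents, one of them retrieved at rank 2.
ranked = ["d3", "d1", "d7"]
judgments = {"d1": 1, "d2": 1}

ndcg = ndcg_at_k(ranked, judgments)
mrr = mrr_at_k(ranked, judgments)
ap = ap_at_k(ranked, judgments)
```

MRR only rewards the first relevant hit, which is why it can sit well above MAP and nDCG when a query has many relevant documents and only a few are ranked early.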

SCIDOCS

scientific retrieval en

Citation prediction, document classification, and recommendation for scientific papers

Corpus: 25,656 Queries: 1,000
Performance L4 b1 c16
Corpus 48.4K tok/s
Corpus p50 53.4ms
Query 4.6K tok/s
Query p50 38.3ms

SciFact

scientific retrieval en

Scientific claim verification using research literature

Corpus: 5,183 Queries: 300
Performance L4 b1 c16
Corpus 63.3K tok/s
Corpus p50 57.3ms
Query 6.8K tok/s
Query p50 38.1ms

StackOverflowQA

technology retrieval en

Programming question answering from Stack Overflow

Corpus: 19,931 Queries: 1,994
Performance L4 b1 c16
Corpus 53.8K tok/s
Corpus p50 54.7ms
Query 97.8K tok/s
Query p50 45.7ms
