
Qwen/Qwen3-VL-Reranker-2B

The Qwen3-VL-Embedding and Qwen3-VL-Reranker model series are the latest additions to the Qwen family, built on the recently open-sourced Qwen3-VL foundation model.

Overview

Architecture: qwen3_vl
Parameters: 2.1B
Tasks: Score
Outputs: Score
Max Sequence Length: 32,768 tokens
License: apache-2.0

Benchmarks

AskUbuntuDupQuestions

Tags: technology, reranking, en

Duplicate question detection from AskUbuntu

Corpus: 6,743
Queries: 360

Quality
nDCG@10: 0.6553
MAP@10: 0.5009
MRR@10: 0.7718
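
For readers unfamiliar with the three quality metrics above, here is a minimal sketch of how they are commonly computed for a single query. It assumes binary relevance labels listed in the order the reranker returned the documents. The function names are illustrative and not taken from any benchmark harness, and the evaluation behind the reported numbers may differ in details such as tie handling or the MAP normalizer. Reported scores would be the mean of the per-query values over all queries.

```python
import math

def dcg_at_k(rels, k):
    # Discounted cumulative gain: each relevant hit is discounted by log2(rank + 1).
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k=10):
    # Normalize by the DCG of the ideal ordering of the same labels.
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def mrr_at_k(rels, k=10):
    # Reciprocal rank of the first relevant document within the top k.
    for i, r in enumerate(rels[:k]):
        if r > 0:
            return 1.0 / (i + 1)
    return 0.0

def ap_at_k(rels, k=10):
    # Average precision: mean of precision values at each relevant hit in the top k.
    # Assumption: normalized by min(number of relevant documents, k).
    hits, total = 0, 0.0
    for i, r in enumerate(rels[:k]):
        if r > 0:
            hits += 1
            total += hits / (i + 1)
    denom = min(sum(1 for r in rels if r > 0), k)
    return total / denom if denom else 0.0

# Hypothetical labels for one query: relevant documents at ranks 2 and 4.
example = [0, 1, 0, 1, 0]
scores = {
    "ndcg@10": ndcg_at_k(example),
    "map@10": ap_at_k(example),
    "mrr@10": mrr_at_k(example),
}
```

MRR only rewards the position of the first relevant hit, which is one reason it can sit well above MAP on the same ranking.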
Performance L4 b1 c4
