Qwen/Qwen3-Reranker-4B

The Qwen3 Embedding model series is the latest model series in the Qwen family, designed specifically for text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, it offers text embedding and reranking models in three sizes (0.6B, 4B, and 8B).

Overview

Architecture: Qwen3
Parameters: 4.0B
Tasks: Score
Outputs: Score
Max Sequence Length: 32,768 tokens
License: apache-2.0

Benchmarks

AskUbuntuDupQuestions

Tags: technology, reranking, en

Duplicate question detection from AskUbuntu

Corpus: 6,743 Queries: 360
Quality
nDCG@10: 0.6953
MAP@10: 0.5480
MRR@10: 0.7743
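
The quality figures above are ranking metrics cut off at the top 10 results. As a rough illustration of what they measure, here is a minimal Python sketch that computes the three metrics for a single query from a list of relevance labels in ranked order. The function names are hypothetical, and the normalization choices (for example, dividing average precision by the number of relevant documents, capped at k) are assumptions; the benchmark harness behind the reported numbers may use different conventions and averages over all queries.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k (0.0 if none)."""
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def ap_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """Average precision over the top k.

    Assumption: normalized by min(number of relevant items, k).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precision_sum += hits / rank
    n_relevant = min(sum(1 for rel in relevance if rel > 0), k)
    return precision_sum / n_relevant if n_relevant else 0.0


def ndcg_at_k(relevance: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    def dcg(rels: Sequence[float]) -> float:
        # Rank 1 has discount log2(2) = 1, rank 2 has log2(3), and so on.
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# Relevance labels of candidates in the order the reranker returned them
# (1 = relevant duplicate question, 0 = not relevant). Illustrative data only.
ranked = [0, 1, 0, 1, 0]
print(mrr_at_k(ranked))   # 0.5: first relevant item is at rank 2
print(ap_at_k(ranked))    # 0.5: (1/2 + 2/4) / 2
print(ndcg_at_k(ranked))
```

Averaging each per-query value over all queries in a dataset yields the dataset-level score.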
Performance L4 b1 c16
