
google/siglip2-base-patch16-224

SigLIP 2 extends the SigLIP pretraining objective with prior, independently developed techniques, combining them into a unified recipe for improved semantic understanding, localization, and dense features.

Overview

Architecture: SigLIP
Parameters: 375M
Tasks: Encode
Outputs: Dense
Dimensions: 768 (dense)
Max Sequence Length: 64 tokens
License: apache-2.0
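
Because the model emits one dense 768-dimensional vector per input, retrieval reduces to ranking corpus vectors by their similarity to a query vector. The following is a minimal sketch of that ranking step, not this engine's implementation. It assumes cosine similarity as the scoring function (the page does not state which metric is used), and the names `cosine_similarity`, `rank_corpus`, and the toy 3-dimensional vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_corpus(query_vec, corpus_vecs, k=10):
    """Return the ids of the k corpus entries most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in corpus_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional stand-ins (assumption); real embeddings from this
# model would have 768 dimensions.
query = [1.0, 0.0, 0.0]
corpus = {
    "caption_a": [0.9, 0.1, 0.0],
    "caption_b": [0.0, 1.0, 0.0],
    "caption_c": [0.5, 0.5, 0.0],
}
top = rank_corpus(query, corpus, k=2)
```

A brute-force scan like this is fine for a corpus of a few tens of thousands of entries; larger corpora would typically use an approximate nearest-neighbor index instead.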

Benchmarks

Flickr30kI2TRetrieval

general retrieval en

Image-to-text retrieval: retrieve captions from images

Corpus: 31,783
Queries: 1,000

Quality
nDCG@10: 0.8157
MAP@10: 0.7255
MRR@10: 0.9302

Performance (L4 b1 c8)
Corpus: 1.6K tok/s
Corpus p50: 68.5 ms
Query: 13.0 mpix/s
Query p50: 99.0 ms
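
For readers unfamiliar with the quality metrics, here is a rough sketch of how nDCG, MAP, and MRR at a cutoff of 10 can be computed for a single query with binary relevance. It follows common textbook definitions and is an assumption on our part: the benchmark harness behind the numbers above may normalize differently (for example, in how average precision is divided). The function names and example ids are hypothetical. The reported scores would be the mean of each metric over all queries.

```python
import math


def mrr_at_k(ranked_ids, relevant_ids, k=10):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def ap_at_k(ranked_ids, relevant_ids, k=10):
    """Average precision at k, normalized by min(#relevant, k) (one convention)."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    denom = min(len(relevant_ids), k)
    return precision_sum / denom if denom else 0.0


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """nDCG at k with binary gains and a log2(rank + 1) discount."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical single query: two relevant captions, found at ranks 2 and 4.
ranked = ["c3", "c1", "c7", "c2"]
relevant = {"c1", "c2"}

mrr = mrr_at_k(ranked, relevant)    # 0.5: first relevant item is at rank 2
ap = ap_at_k(ranked, relevant)      # 0.5: (1/2 + 2/4) / 2
ndcg = ndcg_at_k(ranked, relevant)  # about 0.651
```

MRR only rewards the first relevant hit, which is one reason it can sit well above MAP and nDCG when each query has several relevant captions.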
