google/owlv2-large-patch14-ensemble

Primitive: /extract · Extract · OWLv2

The OWLv2 model (short for Open-World Localization) was proposed in "Scaling Open-Vocabulary Object Detection" by Matthias Minderer, Alexey Gritsenko, and Neil Houlsby.

Multimodal · Bounding boxes

Overview

Hardware: — drives latency, throughput & cost

Size: 438M params
Tasks: /extract
License: apache-2.0
Latency
Throughput
Cost /1M tok

Cost is approximate — computed from list GPU prices; your actual price depends on the provider you deploy SIE with.

Extraction

Output kinds: Bounding boxes
Inputs: image
Max sequence length

Benchmarks

COCO

general detection en

Object detection on COCO natural images

Corpus: 5,000 Queries: 5,000
Quality
ap 0.4279
ap50 0.6309
ap75 0.4705
ar 100 0.6087
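
To make the metric names concrete: assuming they follow the standard COCO detection conventions (which this page does not state explicitly), ap is average precision averaged over IoU thresholds from 0.50 to 0.95, ap50 and ap75 are average precision at the single IoU thresholds 0.50 and 0.75, and ar 100 is average recall with at most 100 detections per image. The sketch below illustrates only the IoU matching step that those thresholds refer to. The helper names `iou` and `is_match` and the example boxes are hypothetical and are not part of OWLv2 or of the benchmark tooling.

```python
from typing import Sequence

# A box is (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap width and height, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float) -> bool:
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical ground-truth box and a prediction shifted 20 px to the right.
truth = (0.0, 0.0, 100.0, 100.0)
pred = (20.0, 0.0, 120.0, 100.0)

print(f"IoU = {iou(pred, truth):.3f}")
print("hit at 0.50:", is_match(pred, truth, 0.50))
print("hit at 0.75:", is_match(pred, truth, 0.75))
```

A box that overlaps the ground truth only loosely can count toward ap50 while missing at 0.75, which is one reason ap75 (0.4705) sits below ap50 (0.6309) here.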
Reference →
