---
title: OpenAI → SIE
description: Migrate from OpenAI Embeddings to a self-hosted SIE cluster. Drop-in via the OpenAI-compatible endpoint, or use the native SIE SDK.
canonical_url: https://superlinked.com/docs/migrate/openai
last_updated: 2026-05-18
---

OpenAI Embeddings is a paid managed API. SIE is a self-hosted inference
engine. There are two migration paths:

1. **Drop-in shim.** Point the OpenAI SDK at SIE's `/v1/embeddings`
   endpoint. Two-line change, every other call site untouched.
2. **Native SDK.** Use `sie_sdk.SIEClient` directly to access sparse,
   multivector, ColBERT, rerankers, and extraction.

## Why migrate

- **Cost crosses over at moderate volume.** OpenAI bills per token;
  SIE has flat hourly cost. The right answer depends on your workload;
  see the [worked example](#worked-cost-example) below. Plug in your
  own numbers before quoting either side.
- **Data residency.** Embeddings of your text never leave your network.
- **No rate limits.** Your ceiling is whatever GPU capacity you
  provision. You also own the SLA: OpenAI's published 99.9% becomes
  whatever your platform team operates.
- **Model breadth.** OpenAI ships 3 embedding models. SIE serves 100+
  out of the box (109 bundle configs as of writing) across dense,
  sparse, ColBERT/multivector, vision, and rerankers.

## TL;DR

#### Python

```diff
- client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
+ client = OpenAI(api_key="not-needed", base_url="http://sie:8080/v1")
```

#### TypeScript

```diff
- const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
+ const client = new OpenAI({ apiKey: "not-needed", baseURL: "http://sie:8080/v1" });
```

…and `model: "text-embedding-3-small"` becomes `model: "intfloat/e5-base-v2"` (or
whichever SIE model you pick).

## Before

#### Python

```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["The mitochondrion is the powerhouse of the cell."],
)
vector = resp.data[0].embedding  # 1536-dim
```

#### TypeScript

```typescript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const resp = await client.embeddings.create({
  model: "text-embedding-3-small",
  input: ["The mitochondrion is the powerhouse of the cell."],
});
const vector = resp.data[0].embedding; // 1536-dim
```

## After

#### Drop-in (OpenAI SDK + SIE)

<TabItem label="Python">
  ```python
  from openai import OpenAI

  client = OpenAI(api_key="not-needed", base_url="http://localhost:8080/v1")
  resp = client.embeddings.create(
      model="intfloat/e5-base-v2",
      input=["The mitochondrion is the powerhouse of the cell."],
  )
  vector = resp.data[0].embedding  # 768-dim
  ```

#### TypeScript

```typescript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "not-needed", baseURL: "http://localhost:8080/v1" });
const resp = await client.embeddings.create({
  model: "intfloat/e5-base-v2",
  input: ["The mitochondrion is the powerhouse of the cell."],
});
const vector = resp.data[0].embedding; // 768-dim
```

  </TabItem>

#### Native SIE SDK

<TabItem label="Python">
  ```python
  from sie_sdk import SIEClient
  from sie_sdk.types import Item

  client = SIEClient("http://localhost:8080")
  result = client.encode(
      "intfloat/e5-base-v2",
      Item(text="The mitochondrion is the powerhouse of the cell."),
  )
  vector = result["dense"].tolist()
  ```

#### TypeScript

```typescript
import { SIEClient } from "@superlinked/sie-sdk";

const client = new SIEClient("http://localhost:8080");
const result = await client.encode(
  "intfloat/e5-base-v2",
  { text: "The mitochondrion is the powerhouse of the cell." },
);
const vector = Array.from(result.dense ?? []);

await client.close();
```

  </TabItem>

## Mapping

| OpenAI                                | SIE equivalent                                                   |
|---------------------------------------|------------------------------------------------------------------|
| `text-embedding-3-small` (1536)       | `intfloat/e5-base-v2` (768)                                      |
| `text-embedding-3-large` (3072)       | `Alibaba-NLP/gte-Qwen2-1.5B-instruct` or `NovaSearch/stella_en_1.5B_v5` |
| `dimensions=N` truncation             | Slice client-side, or pick a smaller-dim model                   |
| `encoding_format="base64"`            | Supported on `/v1/embeddings` with the same field                |
| `user="..."` for abuse tracking       | Use SIE telemetry / your own tracing                             |

## Worked cost example

Public list prices, no negotiated discounts, single region:

| Workload          | OpenAI (`text-embedding-3-small` @ $0.02 / 1M tokens) | SIE on one `g5.xlarge` (1× A10G, ~$1.00/hr on-demand) |
|-------------------|-------------------------------------------------------|--------------------------------------------------------|
| 10M tokens / day  | ~$0.20 / day · ~$6 / month                            | ~$24 / day · ~$730 / month                             |
| 100M tokens / day | ~$2 / day · ~$60 / month                              | ~$24 / day · ~$730 / month                             |
| 1B tokens / day   | ~$20 / day · ~$600 / month                            | ~$24 / day · ~$730 / month                             |
| 5B tokens / day   | ~$100 / day · ~$3,000 / month                         | needs ≥1 GPU; still ~$730–$2,200 / month               |

Crossover sits around 1.2B tokens / day on this size class. Below that, OpenAI is cheaper *and* you don't run a
GPU. Above that, SIE wins on cost and the gap widens linearly. **None
of this counts the engineering cost of operating a GPU pool**, which
is the actually-load-bearing variable for most teams. Plug in your
own utilization, GPU size, and reserved-instance discount before
quoting a number to your manager.

## Re-embed required?

**Yes.** Different model → different vector space. Even
`text-embedding-3-small` truncated to 1024 dims is not interchangeable
with any open-source model. Plan a re-embed window before cutting over.

## Run it yourself

```bash
# Start SIE with E5 loaded.
mise run serve -- -m intfloat/e5-base-v2

# Run the OpenAI 'before' script and the SIE 'after' script
# from this page. Compare the printed embeddings.
export OPENAI_API_KEY=sk-...
uv add openai
```

Cosine across spaces carries no signal, so don't expect 1.0. For
sign-off, run your retrieval eval against both legs.
