---
title: Chroma
description: Use SIE embeddings with ChromaDB for vector search.
canonical_url: https://superlinked.com/docs/integrations/chroma
last_updated: 2026-05-20
---

The `sie-chroma` package (Python) and `@superlinked/sie-chroma` package (TypeScript) provide embedding functions for ChromaDB. Use `SIEEmbeddingFunction` for dense embeddings in standard collections. Use `SIESparseEmbeddingFunction` for hybrid search on Chroma Cloud.

## Installation

#### Python

```bash
pip install sie-chroma
```
This installs `sie-sdk` and `chromadb` as dependencies.

#### TypeScript

```bash
pnpm add @superlinked/sie-chroma
```
This installs `@superlinked/sie-sdk` and `chromadb` as dependencies.

## Start the Server

Source: [packages/sie_server/src/sie_server/cli.py](https://github.com/superlinked/sie/blob/main/packages/sie_server/src/sie_server/cli.py)

```bash
# Docker (recommended)
docker run -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cpu-default

# Or with GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cuda12-default
```
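The container can take a moment before it accepts connections. The sketch below polls the published port from Python until a TCP connection succeeds. `wait_for_port` is a hypothetical helper, not part of `sie-sdk`, and an open port only shows that the server process is listening, not that a model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False


# Example usage (assumes the Docker command above published port 8080):
# if not wait_for_port("localhost", 8080):
#     raise RuntimeError("SIE server did not open port 8080 in time")
```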

## Embedding Function

Source: [integrations/sie_chroma/src/sie_chroma/embedding_function.py](https://github.com/superlinked/sie/blob/main/integrations/sie_chroma/src/sie_chroma/embedding_function.py)

`SIEEmbeddingFunction` implements ChromaDB's `EmbeddingFunction` protocol. Use it when creating or querying collections.
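As a rough illustration of what that protocol asks for, an embedding function is a callable that takes a list of documents and returns one fixed-length vector per document, in the same order. The toy class below is a simplified stand-in: `ToyEmbeddingFunction` is hypothetical, it hashes text instead of calling a model, and the real protocol in recent `chromadb` versions may expect additional methods.

```python
import hashlib
from typing import List


class ToyEmbeddingFunction:
    """Toy stand-in showing the callable shape; not a real embedding model."""

    def __init__(self, dim: int = 8) -> None:
        # sha256 yields 32 bytes, so dim must be at most 32 in this toy.
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # One vector per input document, returned in input order.
        return [self._embed(text) for text in input]

    def _embed(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dim` bytes to floats in [0, 1].
        return [byte / 255.0 for byte in digest[: self.dim]]


ef = ToyEmbeddingFunction(dim=4)
vectors = ef(["hello", "world"])
print(len(vectors), len(vectors[0]))  # 2 4
```

`SIEEmbeddingFunction` fills the same role, with the vectors coming from the SIE server instead of a hash.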

#### Python

```python
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)
```

#### TypeScript

```typescript
import { SIEEmbeddingFunction } from "@superlinked/sie-chroma";

const embeddingFunction = new SIEEmbeddingFunction({
  baseUrl: "http://localhost:8080",
  model: "BAAI/bge-m3",
});
```

### Configuration Options

#### Python

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `base_url` | `str` | `http://localhost:8080` | SIE server URL |
| `model` | `str` | `BAAI/bge-m3` | Model to use for embeddings |
| `gpu` | `str \| None` | `None` | Target GPU type for routing |
| `options` | `dict \| None` | `None` | Model-specific options |
| `timeout_s` | `float` | `180.0` | Request timeout in seconds |

#### TypeScript

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `baseUrl` | `string` | `http://localhost:8080` | SIE server URL |
| `model` | `string` | `BAAI/bge-m3` | Model to use for embeddings |
| `gpu` | `string` | `undefined` | Target GPU type for routing |
| `timeout` | `number` | `180000` | Request timeout in milliseconds |
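On the Python side, one way to keep these settings out of application code is to read them from the environment and fall back to the defaults in the Python table above. This is a sketch only: the variable names `SIE_BASE_URL`, `SIE_MODEL`, `SIE_GPU`, and `SIE_TIMEOUT_S` are invented for this example and are not read by `sie-chroma` itself.

```python
import os
from typing import Any, Dict, Mapping


def sie_kwargs_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments using the documented Python parameter names."""
    kwargs: Dict[str, Any] = {
        "base_url": env.get("SIE_BASE_URL", "http://localhost:8080"),
        "model": env.get("SIE_MODEL", "BAAI/bge-m3"),
        # The Python parameter is in seconds; the TypeScript one is in milliseconds.
        "timeout_s": float(env.get("SIE_TIMEOUT_S", "180.0")),
    }
    gpu = env.get("SIE_GPU")
    if gpu:
        # Only pass `gpu` when set, so the default (None) applies otherwise.
        kwargs["gpu"] = gpu
    return kwargs


print(sie_kwargs_from_env({}))
```

The result can then be unpacked into the constructor, for example `SIEEmbeddingFunction(**sie_kwargs_from_env())`.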

## Full Example

Source: [integrations/sie_chroma/src/sie_chroma/embedding_function.py](https://github.com/superlinked/sie/blob/main/integrations/sie_chroma/src/sie_chroma/embedding_function.py)

Create a ChromaDB collection with SIE embeddings and perform similarity search:

#### Python

```python
import chromadb
from sie_chroma import SIEEmbeddingFunction

# Initialize the embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create a Chroma client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="documents",
    embedding_function=embedding_function,
)

# Add documents
collection.add(
    documents=[
        "Machine learning is a subset of artificial intelligence.",
        "Neural networks are inspired by biological neurons.",
        "Deep learning uses multiple layers of neural networks.",
        "Python is popular for machine learning development.",
    ],
    ids=["doc1", "doc2", "doc3", "doc4"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)

for doc, distance in zip(results["documents"][0], results["distances"][0]):
    print(f"{distance:.4f}: {doc}")
```

#### TypeScript

```typescript
import { ChromaClient } from "chromadb";
import { SIEEmbeddingFunction } from "@superlinked/sie-chroma";

// Initialize the embedding function
const embeddingFunction = new SIEEmbeddingFunction({
  baseUrl: "http://localhost:8080",
  model: "BAAI/bge-m3",
});

// Create a Chroma client and collection
const client = new ChromaClient();
const collection = await client.createCollection({
  name: "documents",
  embeddingFunction,
});

// Add documents
await collection.add({
  documents: [
    "Machine learning is a subset of artificial intelligence.",
    "Neural networks are inspired by biological neurons.",
    "Deep learning uses multiple layers of neural networks.",
    "Python is popular for machine learning development.",
  ],
  ids: ["doc1", "doc2", "doc3", "doc4"],
});

// Query the collection
const results = await collection.query({
  queryTexts: ["What is deep learning?"],
  nResults: 2,
});

for (let i = 0; i < results.documents[0].length; i++) {
  const doc = results.documents[0][i];
  const distance = results.distances?.[0][i];
  console.log(`${distance?.toFixed(4)}: ${doc}`);
}
```
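Both examples index into `[0]` because query results are nested: the outer list has one entry per query text, and each inner list holds the matches for that query. The Python sketch below regroups such a result into `(id, document, distance)` tuples. `flatten_query_results` is a hypothetical helper, and the sample dictionary is hand-written with made-up distances rather than real output.

```python
from typing import Any, Dict, List, Tuple


def flatten_query_results(results: Dict[str, Any]) -> List[List[Tuple[str, str, float]]]:
    """Return, for each query, a list of (id, document, distance) tuples."""
    per_query = []
    for ids, docs, dists in zip(results["ids"], results["documents"], results["distances"]):
        per_query.append(list(zip(ids, docs, dists)))
    return per_query


# Hand-written sample in the shape of a query result; distances are illustrative.
sample = {
    "ids": [["doc3", "doc2"]],
    "documents": [[
        "Deep learning uses multiple layers of neural networks.",
        "Neural networks are inspired by biological neurons.",
    ]],
    "distances": [[0.25, 0.5]],
}

for doc_id, doc, distance in flatten_query_results(sample)[0]:
    print(f"{doc_id} {distance:.4f}: {doc}")
```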

### With Persistent Storage

#### Python

```python
import chromadb
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")

# Use persistent storage
client = chromadb.PersistentClient(path="./chroma_db")

collection = client.get_or_create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)
```

#### TypeScript

```typescript
import { ChromaClient } from "chromadb";
import { SIEEmbeddingFunction } from "@superlinked/sie-chroma";

const embeddingFunction = new SIEEmbeddingFunction({ model: "BAAI/bge-m3" });

// The TypeScript client has no embedded mode: persistence requires a running Chroma server
const client = new ChromaClient({ path: "http://localhost:8000" });

const collection = await client.getOrCreateCollection({
  name: "my_collection",
  embeddingFunction,
});
```

## Sparse Embeddings (Chroma Cloud)

Source: [integrations/sie_chroma/src/sie_chroma/embedding_function.py](https://github.com/superlinked/sie/blob/main/integrations/sie_chroma/src/sie_chroma/embedding_function.py)

`SIESparseEmbeddingFunction` generates sparse embeddings for Chroma Cloud hybrid search. Use it with `SparseVectorIndexConfig`.

#### Python

```python
from sie_chroma import SIESparseEmbeddingFunction

sparse_ef = SIESparseEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)
```

The sparse embedding function returns `dict[int, float]` mappings of token indices to weights. This format is compatible with Chroma Cloud's hybrid search feature.
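To make the format concrete, the sketch below works with two hand-written `dict[int, float]` vectors: it splits one into parallel index and value lists, and computes a dot product over the token indices the two vectors share. The helper names and numbers are illustrative assumptions, and the dot product is only meant to show why overlapping indices matter; it is not a description of how Chroma Cloud scores hybrid queries.

```python
from typing import Dict, List, Tuple


def to_indices_values(sparse: Dict[int, float]) -> Tuple[List[int], List[float]]:
    """Split a sparse dict into parallel, index-sorted lists."""
    indices = sorted(sparse)
    return indices, [sparse[i] for i in indices]


def sparse_dot(a: Dict[int, float], b: Dict[int, float]) -> float:
    """Dot product over the token indices present in both vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(weight * b[i] for i, weight in a.items() if i in b)


doc = {1: 0.5, 5: 0.25, 10: 0.125}
query = {5: 2.0, 10: 4.0, 42: 1.0}

print(to_indices_values(doc))  # ([1, 5, 10], [0.5, 0.25, 0.125])
print(sparse_dot(doc, query))  # 1.0
```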

#### TypeScript

```typescript
import { SIESparseEmbeddingFunction } from "@superlinked/sie-chroma";

const sparseEf = new SIESparseEmbeddingFunction({
  baseUrl: "http://localhost:8080",
  model: "BAAI/bge-m3",
});

// Generate sparse embeddings
const embeddings = await sparseEf.generate(["Hello world"]);
console.log(embeddings[0].indices); // [1, 5, 10, ...]
console.log(embeddings[0].values);  // [0.5, 0.3, 0.2, ...]

// Or as dict format for Chroma Cloud
const dictEmbeddings = await sparseEf.generateAsDict(["Hello world"]);
console.log(dictEmbeddings[0]); // { 1: 0.5, 5: 0.3, 10: 0.2, ... }
```

The sparse embedding function returns `{ indices: number[], values: number[] }` objects or `Record<number, number>` dicts (via `generateAsDict`). Both formats are compatible with Chroma Cloud's hybrid search feature.

## Multimodal Embeddings

The SIE embedding functions for ChromaDB take text input only. To embed images with models like CLIP or SigLIP, encode them with the SIE SDK and pass the pre-computed embeddings to ChromaDB:

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item
import chromadb

sie = SIEClient("http://localhost:8080")
chroma = chromadb.Client()
collection = chroma.create_collection("images")

# Encode images with SIE SDK
results = sie.encode(
    "openai/clip-vit-large-patch14",
    [Item(images=["img1.jpg"]), Item(images=["img2.jpg"])],
    output_types=["dense"]
)

# Store pre-computed embeddings in Chroma
collection.add(
    ids=["img1", "img2"],
    embeddings=[r["dense"].tolist() for r in results],
    metadatas=[{"path": "img1.jpg"}, {"path": "img2.jpg"}]
)
```

See [Encode](/docs/encode/) for full SDK documentation and the [Model Catalog](/models#task=encode) for supported vision models.
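When storing pre-computed embeddings, `ids`, `embeddings`, and `metadatas` are parallel lists, so a length or dimension mismatch is easy to introduce. The sketch below assembles the three lists from file paths and vectors and checks them first. `build_add_payload` is a hypothetical helper, the vectors are placeholder numbers rather than real CLIP output, and deriving IDs from file stems assumes the stems are unique.

```python
from pathlib import Path
from typing import Any, Dict, List, Sequence


def build_add_payload(
    paths: Sequence[str], vectors: Sequence[Sequence[float]]
) -> Dict[str, List[Any]]:
    """Build parallel ids/embeddings/metadatas lists for pre-computed vectors."""
    if len(paths) != len(vectors):
        raise ValueError("need exactly one embedding per path")
    if len({len(v) for v in vectors}) > 1:
        raise ValueError("all embeddings must have the same dimension")
    return {
        "ids": [Path(p).stem for p in paths],  # "img1.jpg" -> "img1"
        "embeddings": [[float(x) for x in v] for v in vectors],
        "metadatas": [{"path": p} for p in paths],
    }


# Placeholder vectors standing in for encoder output.
payload = build_add_payload(["img1.jpg", "img2.jpg"], [[0.1, 0.2], [0.3, 0.4]])
print(payload["ids"])  # ['img1', 'img2']
```

The payload can then be passed along as `collection.add(**payload)`.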

## What's Next

- [Encode Text](/docs/encode/) - embedding API details and output types
- [Model Catalog](/models#task=encode) - all supported embedding models
- [Troubleshooting](/docs/reference/troubleshooting/) - common errors and solutions
