---
title: How to make AI agent infrastructure portable across AWS, GCP, Azure, and customer clouds
description: "Keep the inference layer as one portable artifact: the same Docker image, Helm chart, and SDK calls on any Kubernetes cluster, from a laptop to any cloud."
canonical_url: https://superlinked.com/blog/portable-ai-agent-infrastructure-aws-gcp-azure
last_updated: 2026-06-16
---

Make the inference layer portable by running the same artifact everywhere: one Docker image, one Helm chart, and identical SDK code from a laptop to any Kubernetes cluster.

The Superlinked Inference Engine (SIE) is built this way.

The same image runs locally and in production, with no separate production mode. SIE ships Terraform modules for AWS and GCP, runs on any Kubernetes cluster (Azure AKS and on-premise included), and has an offline path for customer environments.

*Open source: [github.com/superlinked/sie](https://github.com/superlinked/sie).*

<BlogSieCta />

## How can I make AI agent infrastructure portable across AWS, GCP, Azure, and customer environments?

Keep the inference layer as one portable artifact: the same Docker image, the same Helm chart, and the same SDK calls on any Kubernetes cluster. Moving clouds then means redeploying that artifact, not rewriting the stack.

## What actually has to move

The hardest part of an agent to relocate is the small-model inference: embeddings, reranking, extraction, document parsing. Tie that to one cloud's managed service and every new environment becomes a porting project. SIE keeps the layer self-contained, an open-source server you run on GPUs you control, so changing clouds means redeploying the same artifact rather than rewriting the stack.

## Same image, every environment

The Docker image is the unit of portability. It auto-detects CUDA, Apple Silicon, or CPU, so laptop and cluster run identical code:

```bash
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cuda12-default
```

```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")
client.encode("BAAI/bge-m3", Item(text="runs the same everywhere"))
```
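Because the SDK calls depend only on a base URL, one way to keep application code identical across environments is to read that URL from configuration. The sketch below is an illustration rather than part of the SIE SDK: the `SIE_BASE_URL` variable name and the `resolve_sie_url` helper are hypothetical, and the fallback is the local address used in the example above.

```python
import os
from typing import Mapping, Optional

# Fallback matches the local `docker run` example above.
DEFAULT_SIE_URL = "http://localhost:8080"


def resolve_sie_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the SIE base URL for the current environment.

    SIE_BASE_URL is a hypothetical variable name chosen for this sketch;
    use whatever your deployment tooling already injects.
    """
    env = os.environ if env is None else env
    url = env.get("SIE_BASE_URL", "").strip()
    # An unset or blank variable falls back to the local server.
    return (url or DEFAULT_SIE_URL).rstrip("/")


# With the SDK shown above (not imported here so the sketch runs standalone):
#   client = SIEClient(resolve_sie_url())
```

With this in place, moving from a laptop to a cluster means changing one environment variable in the deployment manifest, not the agent code.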

## AWS and GCP: Terraform modules

SIE ships ready-made Terraform modules for both clouds, so provisioning a cluster is a short module block plus a Helm install:

```hcl
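# Pick the block for your cloud. The labels differ because Terraform
# rejects two module calls with the same name in one configuration.
module "sie_aws" {             # AWS
  source = "superlinked/sie/aws"
  region = "us-east-1"
  gpus   = ["a100-40gb", "l4-spot"]
}

module "sie_gcp" {             # GCP
  source = "superlinked/sie/google"
  region = "us-central1"
  gpus   = ["a100-40gb", "l4-spot"]
}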
```

Guides: [Kubernetes in AWS](/docs/deployment/cloud-aws) and [Kubernetes in GCP](/docs/deployment/cloud-gcp).

## Azure and on-premise: same chart, your cluster

SIE runs on any Kubernetes cluster, which covers Azure AKS and on-premise through the same Docker image and Helm chart:

```bash
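# Export HF_TOKEN first; a literal <TOKEN> placeholder would be parsed by the shell as a redirection.
helm upgrade --install sie-cluster oci://ghcr.io/superlinked/charts/sie-cluster \
  --namespace sie --create-namespace \
  --set hfToken.create=true --set hfToken.value="$HF_TOKEN"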
```

The pre-built Terraform modules target AWS and GCP today, so on Azure or bare metal you provision the GPU cluster with your own infrastructure code and install the same chart. The application layer does not change, only how the cluster underneath is created.

## Customer environments: offline and air-gapped

Most "deploy in our cloud" requirements are decided here. SIE supports air-gapped deployment with model-weight snapshots, so the cluster needs no runtime access to Hugging Face, and documents never leave the host. See [Offline and air-gapped deployment](/docs/deployment/offline).

Caching travels with the deployment too: a local disk cache, an optional shared cluster cache on an S3 or GCS bucket to avoid redundant downloads, and Hugging Face Hub only on a cold miss.
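The lookup order described above can be sketched as a list of tiers tried in sequence. This is a minimal illustration of the idea, not SIE's actual cache implementation: `fetch_weights` and the in-memory dictionaries standing in for the disk cache, the bucket, and the Hub are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A tier is any callable that returns the payload, or None on a miss.
Lookup = Callable[[str], Optional[bytes]]


def fetch_weights(model_id: str, tiers: Sequence[Tuple[str, Lookup]]) -> Tuple[str, bytes]:
    """Try each tier in order and return (tier_name, payload) for the first hit."""
    for name, lookup in tiers:
        payload = lookup(model_id)
        if payload is not None:
            return name, payload
    raise LookupError(f"{model_id} not found in any cache tier")


# In-memory stand-ins for the three tiers (hypothetical contents).
local_disk: dict = {}
cluster_bucket = {"BAAI/bge-m3": b"weights"}
hub = {"BAAI/bge-m3": b"weights", "other/model": b"other-weights"}

tiers = [
    ("local", local_disk.get),
    ("cluster", cluster_bucket.get),
    ("hub", hub.get),  # reached only when both caches miss
]
```

In an air-gapped deployment the last tier would simply be absent, so a miss on the snapshot surfaces as an error instead of an outbound request.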

## FAQ: portability specifics

**Will the exact same code run on my laptop and in production?** Yes. The same Docker image and the same SDK calls work in both. There is no separate production mode to maintain.

**Is there a prebuilt Terraform module for Azure?** Not currently. The provided modules target AWS and GCP. On Azure, run the same Helm chart on AKS and provision the cluster with your own infrastructure code.

**Can it run fully offline inside a customer's environment?** Yes. Air-gapped deployment uses model-weight snapshots, and no document data leaves the host running the cluster.

**Does moving between clouds change my agent's application code?** No. Your `encode`, `score`, and `extract` calls point at a base URL. Only the cluster provisioning underneath it changes.

*Take the same image from your laptop to whichever cloud the customer runs on: [github.com/superlinked/sie](https://github.com/superlinked/sie).* Deployment paths are in the [deployment overview](/docs/deployment).