Hybrid Deployment
Hybrid deployment runs the Rivano gateway on your own infrastructure while the control plane (policies, traces, cost tracking, dashboard) remains hosted at api.rivano.ai. AI request content never leaves your network; only metadata (token counts, latency, cost, policy decisions) is sent to the control plane.
Architecture
     Your Application
             │
             ▼
┌─────────────────────────────┐
│        Your Network         │
│                             │
│   ┌─────────────────────┐   │
│   │   Rivano Gateway    │   │
│   │   (self-hosted)     │   │
│   │                     │   │
│   │  • Policy eval      │   │
│   │  • PII detection    │   │
│   │  • Injection score  │   │
│   │  • Provider forward │   │
│   └────────┬────────────┘   │
│            │ metadata only  │
└────────────│────────────────┘
             │
             ▼
       api.rivano.ai
      (control plane)
    • Policy sync
    • Trace ingest (metadata)
    • Cost calculation
    • Dashboard
Content stays local. Request bodies, response bodies, and PII are processed inside the gateway and never forwarded to the control plane. Only metadata is transmitted: token counts, duration, cost, injection score, entity types detected (not values), and policy decisions.
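To make the content/metadata split concrete, here is a minimal sketch of how local analysis results could be reduced to a metadata-only record. The `TraceMetadata` class, its field names, and the shape of the PII findings are all hypothetical; the actual wire format used by the gateway is not documented here.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class TraceMetadata:
    """Hypothetical shape of a metadata-only trace record."""
    prompt_tokens: int
    completion_tokens: int
    duration_ms: int
    injection_score: float
    pii_entity_types: List[str]  # entity types only, never the matched values
    policy_decision: str


def build_trace_metadata(pii_findings, usage, duration_ms, injection_score, policy_decision):
    """Reduce local analysis results to metadata; request content is dropped here."""
    # Keep only the type of each finding; the matched value stays in the gateway.
    entity_types = sorted({finding["type"] for finding in pii_findings})
    return TraceMetadata(
        prompt_tokens=usage["prompt_tokens"],
        completion_tokens=usage["completion_tokens"],
        duration_ms=duration_ms,
        injection_score=injection_score,
        pii_entity_types=entity_types,
        policy_decision=policy_decision,
    )


# Illustrative findings, as a local PII detector might report them.
findings = [
    {"type": "EMAIL", "value": "jane@example.com"},
    {"type": "PHONE", "value": "+1 555 0100"},
    {"type": "EMAIL", "value": "bob@example.com"},
]
record = build_trace_metadata(
    findings, {"prompt_tokens": 42, "completion_tokens": 128}, 850, 0.03, "allow"
)
print(asdict(record))
```

The record carries `["EMAIL", "PHONE"]` but none of the matched strings, which is the property the hybrid model depends on.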
Prerequisites
- Docker or Kubernetes environment
- Outbound HTTPS access to api.rivano.ai (port 443)
- An ingest-scoped Rivano API key
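Before deploying, it can help to confirm that the host can open an outbound connection to the control plane. The sketch below is an illustrative helper (the name `can_reach` is not part of any Rivano tooling) that checks TCP reachability only; it does not validate TLS, proxies, or the API key.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example (run from the machine that will host the gateway):
#   can_reach("api.rivano.ai")
```

A `False` result usually points at an egress firewall rule or DNS configuration rather than at the gateway itself.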
Gateway configuration
Create a gateway.yaml configuration file:
# gateway.yaml
rivano:
  api_key: "${RIVANO_API_KEY}"        # ingest-scoped key
  base_url: "https://api.rivano.ai"   # control plane URL
  sync_interval: 30                   # seconds between config polls
  sync_etag: true                     # use ETag for efficient polling

server:
  host: "0.0.0.0"
  port: 8080

providers:
  - name: openai
    base_url: "https://api.openai.com"
    api_key: "${OPENAI_API_KEY}"
  - name: anthropic
    base_url: "https://api.anthropic.com"
    api_key: "${ANTHROPIC_API_KEY}"

telemetry:
  trace_metadata: true      # send metadata to control plane
  trace_content: false      # NEVER send content; this is the hybrid guarantee
  batch_size: 100
  flush_interval_ms: 5000
trace_content: false is the critical setting for data residency. With this setting, request and response bodies are processed locally and never leave your network. Do not set it to true unless you explicitly want content in the hosted control plane.
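Because a single flag carries the residency guarantee, a pre-flight check in CI or at container start can catch mistakes early. The following is a rough sketch, not a Rivano tool: it scans the config as plain text (it is not a YAML parser), assumes `${VAR}` placeholders are filled from environment variables, and treats a missing `trace_content` line as a problem by choice.

```python
import os
import re

PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
TRACE_CONTENT = re.compile(r"^\s*trace_content:\s*(\S+)", re.MULTILINE)


def preflight(config_text, env=None):
    """Return a list of human-readable problems found in the config text."""
    env = os.environ if env is None else env
    problems = []
    # Every ${VAR} referenced in the file should have a non-empty value.
    for name in sorted(set(PLACEHOLDER.findall(config_text))):
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    # The residency flag should be present and explicitly false.
    match = TRACE_CONTENT.search(config_text)
    if match is None:
        problems.append("trace_content is not set explicitly")
    elif match.group(1).lower() != "false":
        problems.append("trace_content is not false; content would leave the network")
    return problems


sample = """
rivano:
  api_key: "${RIVANO_API_KEY}"
telemetry:
  trace_content: true
"""
print(preflight(sample, {"RIVANO_API_KEY": ""}))
```

In practice you would call `preflight(open("gateway.yaml").read())` and fail the deployment if the returned list is non-empty.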
Docker deployment
# Dockerfile (or use the pre-built image)
FROM rivano/gateway:latest
COPY gateway.yaml /app/gateway.yaml
ENV RIVANO_API_KEY=""
ENV OPENAI_API_KEY=""
ENV ANTHROPIC_API_KEY=""
EXPOSE 8080
CMD ["rivano-gateway", "--config", "/app/gateway.yaml"]
docker run -d \
  --name rivano-gateway \
  -p 8080:8080 \
  -e RIVANO_API_KEY="rv_ingest_..." \
  -e OPENAI_API_KEY="sk-..." \
  -e ANTHROPIC_API_KEY="sk-ant-..." \
  -v "$(pwd)/gateway.yaml:/app/gateway.yaml:ro" \
  rivano/gateway:latest
Kubernetes deployment
# k8s/gateway-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rivano-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: rivano-gateway
  template:
    metadata:
      labels:
        app: rivano-gateway
    spec:
      containers:
        - name: gateway
          image: rivano/gateway:latest
          ports:
            - containerPort: 8080
          env:
            - name: RIVANO_API_KEY
              valueFrom:
                secretKeyRef:
                  name: rivano-secrets
                  key: api-key
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: rivano-secrets
                  key: openai-api-key
            # Needed because gateway.yaml references ${ANTHROPIC_API_KEY};
            # assumes the secret also holds an anthropic-api-key entry.
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: rivano-secrets
                  key: anthropic-api-key
          volumeMounts:
            - name: config
              mountPath: /app/gateway.yaml
              subPath: gateway.yaml
      volumes:
        - name: config
          configMap:
            name: rivano-gateway-config
---
apiVersion: v1
kind: Service
metadata:
  name: rivano-gateway
spec:
  selector:
    app: rivano-gateway
  ports:
    - port: 8080
      targetPort: 8080
Config polling
The gateway polls api.rivano.ai every 30 seconds (configurable via sync_interval) to pick up policy changes. Polling uses HTTP ETags to avoid re-downloading unchanged config.
GET https://api.rivano.ai/api/gateway/config
Authorization: Bearer rv_ingest_...
If-None-Match: "etag-abc123"
→ 304 Not Modified (config unchanged, no response body transferred)
→ 200 OK (updated config, new ETag)
Policy changes you make in the dashboard take effect in the gateway within sync_interval seconds. For immediate propagation, restart the gateway container.
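The conditional-request pattern above can be sketched from the client side as follows. This is an illustration of ETag polling in general, not the gateway's actual implementation: the function names are hypothetical, and the assumption that the config body is JSON is not confirmed by this page.

```python
import json
import time
import urllib.error
import urllib.request

CONFIG_URL = "https://api.rivano.ai/api/gateway/config"


def fetch_config(api_key, etag=None, url=CONFIG_URL, opener=urllib.request.urlopen):
    """Perform one conditional GET.

    Returns (config, etag). config is None when the server answers 304.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    if etag:
        headers["If-None-Match"] = etag
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=10) as response:
            # Assumption: the config is returned as a JSON document.
            return json.load(response), response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None, etag  # unchanged: keep the current config and ETag
        raise


def poll(api_key, apply_config, sync_interval=30):
    """Poll forever, calling apply_config only when the config changed."""
    etag = None
    while True:
        try:
            config, etag = fetch_config(api_key, etag)
            if config is not None:
                apply_config(config)
        except OSError:
            # Network or HTTP error: keep running on the last-synced config.
            pass
        time.sleep(sync_interval)
```

Note that `urllib` raises `HTTPError` for a 304 response, which is why the unchanged case is handled in the `except` branch.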
Data residency guarantees
With trace_content: false:
| Data | Where it stays |
|---|---|
| Request prompt / messages | Your network only |
| Response content | Your network only |
| PII values | Your network only |
| Token counts | Sent to control plane |
| Request duration | Sent to control plane |
| Injection score (numeric) | Sent to control plane |
| PII entity types (not values) | Sent to control plane |
| Policy decisions | Sent to control plane |
| Cost calculation | Calculated in control plane from token counts |
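The last row is why cost tracking works without content: a cost figure can be derived from token counts alone. The sketch below shows the arithmetic with made-up prices; the model name, the per-million-token rates, and the `estimate_cost` function are hypothetical and do not reflect Rivano's or any provider's actual pricing.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens.
PRICES = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Compute a request's cost from token counts only."""
    price = PRICES[model]
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / Decimal(1_000_000)


# 1,200 prompt tokens and 350 completion tokens:
# (1200 * 3.00 + 350 * 15.00) / 1,000,000 USD
print(estimate_cost("example-model", 1200, 350))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.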
Offline mode
If the gateway loses connectivity to the control plane, it continues to proxy requests using its last-synced policy config. Trace metadata is buffered locally and flushed when connectivity is restored (up to 10,000 buffered traces).
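A bounded buffer with batched flushing behaves roughly as sketched below. This is an illustration under stated assumptions, not the gateway's code: the `TraceBuffer` class is hypothetical, and dropping the oldest trace once the 10,000 limit is reached is an assumption, since this page does not say which traces are discarded on overflow. The batch size of 100 mirrors `batch_size` in gateway.yaml.

```python
from collections import deque
from itertools import islice


class TraceBuffer:
    """Hold trace metadata while the control plane is unreachable."""

    def __init__(self, max_traces=10_000, batch_size=100):
        # Assumption: when full, the oldest trace is dropped (deque maxlen).
        self._queue = deque(maxlen=max_traces)
        self._batch_size = batch_size

    def __len__(self):
        return len(self._queue)

    def add(self, trace):
        self._queue.append(trace)

    def flush(self, send):
        """Send buffered traces oldest-first in batches.

        Stops at the first failed batch and keeps the remainder buffered.
        Returns the number of traces sent.
        """
        sent = 0
        while self._queue:
            batch = list(islice(self._queue, self._batch_size))
            try:
                send(batch)
            except OSError:
                break  # still offline; retry on the next flush
            # Remove traces only after the batch was delivered.
            for _ in batch:
                self._queue.popleft()
            sent += len(batch)
        return sent
```

Removing traces only after `send` succeeds means a failed flush loses nothing, at the cost of possibly re-sending a batch if the failure happened after the server received it.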
Related
- Gateway Overview — Full gateway configuration reference
- Self-Hosting — Deploy the full stack on your infrastructure
- Migrate from Direct — Change your base URL to point at the gateway