Rivano tracks the cost of every AI request in real time and attributes it by agent, model, team, and user. There is no sampling: every token is counted and priced.

How It Works

When a request flows through the Rivano proxy, three things happen:

Token counting — Rivano counts input and output tokens from the provider’s response (using the provider’s reported usage, not estimates).
Model pricing — Tokens are multiplied by the model’s current pricing. Rivano maintains an up-to-date pricing table for all supported providers.
Attribution — Cost is attributed to the agent, and optionally to a team or user via the X-Rivano-User header.
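
As a worked illustration of the pricing step, the sketch below multiplies reported token counts by per-million-token rates. The `requestCost` helper and the `RATES` map are hypothetical names used only for this example, not part of a Rivano SDK; the rates are copied from the Model Pricing table on this page.

```typescript
// Per-million-token rates in USD, copied from the Model Pricing table on this page.
interface ModelRate {
  inputPerMillion: number;
  outputPerMillion: number;
}

const RATES: Record<string, ModelRate> = {
  "gpt-4o": { inputPerMillion: 2.5, outputPerMillion: 10.0 },
  "gpt-4o-mini": { inputPerMillion: 0.15, outputPerMillion: 0.6 },
};

// Hypothetical helper (an assumption for illustration): cost in USD of one request.
function requestCost(model: string, inputTokens: number, outputTokens: number): number {
  const rate = RATES[model];
  if (rate === undefined) {
    throw new Error(`No pricing entry for model "${model}"`);
  }
  // Sum the per-token charges first, then scale from "per 1M tokens" to "per token".
  return (inputTokens * rate.inputPerMillion + outputTokens * rate.outputPerMillion) / 1_000_000;
}

// 1,000 input + 500 output tokens on GPT-4o:
// (1000 * 2.50 + 500 * 10.00) / 1,000,000 = 0.0075 USD
const exampleCost = requestCost("gpt-4o", 1000, 500);
```

Because output tokens are priced higher than input tokens for every model in the table, the same total token count can cost noticeably more when the response is long.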

Adding User Attribution

Pass the X-Rivano-User header to attribute costs to specific users or teams:

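import OpenAI from "openai";

// Route requests through the Rivano proxy. The OpenAI SDK reads its API key
// from OPENAI_API_KEY unless apiKey is passed explicitly.
const client = new OpenAI({
  baseURL: "https://proxy.rivano.ai/v1",
  defaultHeaders: {
    "X-Rivano-Agent": process.env.RIVANO_AGENT_ID,
    "X-Rivano-Key": process.env.RIVANO_API_KEY,
    // Attribute every request made with this client to one user.
    "X-Rivano-User": "user:jane@example.com",
  },
});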
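
Default headers apply to every request a client makes, so a service that handles many users would probably want to set the header per request instead. The sketch below is one way to do that. `withRivanoUser` is a hypothetical helper, and the `user:` prefix is taken from the example value above; whether Rivano accepts other prefixes is not covered here.

```typescript
// Hypothetical helper (an assumption for illustration): builds per-request
// options that carry the attribution header for a single user.
function withRivanoUser(userId: string): { headers: Record<string, string> } {
  const trimmed = userId.trim();
  if (trimmed.length === 0) {
    throw new Error("userId must not be empty");
  }
  // The "user:" prefix mirrors the example header value shown above.
  return { headers: { "X-Rivano-User": `user:${trimmed}` } };
}

const requestOptions = withRivanoUser("jane@example.com");
// requestOptions.headers["X-Rivano-User"] is "user:jane@example.com"

// The OpenAI Node SDK takes request options as the second argument, e.g.:
// await client.chat.completions.create(
//   { model: "gpt-4o", messages: [{ role: "user", content: "Hello" }] },
//   requestOptions,
// );
```

Keeping the agent and key headers on the client and only the user header on the request means one client instance can serve every user.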

Viewing Costs

Dashboard

The Costs page shows a real-time breakdown of your AI spend:

Total spend — daily, weekly, monthly, or custom range
By agent — which agents are driving the most cost
By model — compare spend across GPT-4o, Claude, Gemini, etc.
By user — who on your team is using the most tokens (requires X-Rivano-User header)
Cost trend — line chart showing spend over time with anomaly highlighting

API

curl "https://api.rivano.ai/v1/costs/summary?period=7d" \
  -H "Authorization: Bearer rv_live_abc123"

Response:

{
  "period": "7d",
  "total_cost": 127.43,
  "total_requests": 8432,
  "total_input_tokens": 2145000,
  "total_output_tokens": 1876000,
  "by_agent": [
    { "agent_id": "agent_abc123", "name": "prod-assistant", "cost": 89.21 },
    { "agent_id": "agent_def456", "name": "support-bot", "cost": 38.22 }
  ],
  "by_model": [
    { "model": "gpt-4o", "cost": 102.15, "requests": 5200 },
    { "model": "gpt-4o-mini", "cost": 12.30, "requests": 2800 },
    { "model": "claude-sonnet-4-20250514", "cost": 12.98, "requests": 432 }
  ]
}
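
To show how this response might be consumed, the sketch below types the payload and derives each model's share of total spend. The `CostSummary` interface is inferred from the sample response above and is an assumption, not an official schema; `modelShares` is a hypothetical helper.

```typescript
// Shape inferred from the sample response above (assumed, not an official schema).
interface CostSummary {
  period: string;
  total_cost: number;
  total_requests: number;
  total_input_tokens: number;
  total_output_tokens: number;
  by_agent: { agent_id: string; name: string; cost: number }[];
  by_model: { model: string; cost: number; requests: number }[];
}

// Hypothetical helper: each model's fraction of total spend, largest first.
function modelShares(summary: CostSummary): { model: string; share: number }[] {
  if (summary.total_cost <= 0) return [];
  return summary.by_model
    .map((m) => ({ model: m.model, share: m.cost / summary.total_cost }))
    .sort((a, b) => b.share - a.share);
}

// The sample response from above, as a typed value.
const sample: CostSummary = {
  period: "7d",
  total_cost: 127.43,
  total_requests: 8432,
  total_input_tokens: 2145000,
  total_output_tokens: 1876000,
  by_agent: [
    { agent_id: "agent_abc123", name: "prod-assistant", cost: 89.21 },
    { agent_id: "agent_def456", name: "support-bot", cost: 38.22 },
  ],
  by_model: [
    { model: "gpt-4o", cost: 102.15, requests: 5200 },
    { model: "gpt-4o-mini", cost: 12.3, requests: 2800 },
    { model: "claude-sonnet-4-20250514", cost: 12.98, requests: 432 },
  ],
};

const shares = modelShares(sample);
// shares[0].model is "gpt-4o", at roughly 80% of the week's spend
```

In the sample data, the per-agent and per-model costs each sum to `total_cost`, so the shares add up to 1.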

Budget Alerts

Set budget thresholds to get notified before costs spiral. Alerts can be configured per agent, per model, or organization-wide.

Via YAML Policy

name: daily-budget-alert
description: Alert when daily spend exceeds $50
status: active
priority: 20

conditions:
  direction: both

budget:
  threshold: 50.00
  period: daily
  scope: organization

action: log
notifications:
  - type: email
    recipients:
      - ops@example.com
  - type: slack
    webhook: https://hooks.slack.com/services/T00/B00/xxx
  - type: webhook
    url: https://example.com/api/budget-alert

Budget Actions

Threshold Type	Action	Description
Warning (80%)	`notify`	Send alert but allow requests
Limit (100%)	`notify`	Send alert, continue allowing requests
Hard limit	`block`	Reject new requests until the next period

To enforce a hard spending cap:

name: hard-budget-cap
description: Block requests when daily spend exceeds $100
status: active
priority: 1

conditions:
  direction: inbound

budget:
  threshold: 100.00
  period: daily
  scope: per_agent
  hard_limit: true

action: block
message: "Daily budget exceeded. Requests will resume tomorrow."
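
The threshold rows in the Budget Actions table can be read as a small decision function. The sketch below is only an illustration of that reading: `evaluateBudget` is a hypothetical name, the 80% warning ratio comes from the table, and how Rivano evaluates policies internally is not described on this page.

```typescript
type BudgetDecision = "allow" | "notify" | "block";

// Hypothetical sketch of the Budget Actions table (not Rivano's implementation).
// - spend above the threshold: "block" if hard_limit is set, otherwise "notify"
// - spend at or above warningRatio * threshold: "notify"
// - anything lower: "allow"
function evaluateBudget(
  spend: number,
  threshold: number,
  hardLimit: boolean,
  warningRatio: number = 0.8,
): BudgetDecision {
  if (threshold <= 0) {
    throw new Error("threshold must be positive");
  }
  if (spend > threshold) {
    return hardLimit ? "block" : "notify";
  }
  if (spend >= threshold * warningRatio) {
    return "notify";
  }
  return "allow";
}

// With the daily-budget-alert policy ($50 threshold, no hard limit):
const quietDay = evaluateBudget(30, 50, false); // "allow"
const busyDay = evaluateBudget(42, 50, false); // "notify" (past the 80% warning)

// With the hard-budget-cap policy ($100 threshold, hard_limit: true):
const overCap = evaluateBudget(101, 100, true); // "block"
```

Note that "notify" is returned both for the warning and for a soft limit, matching the first two table rows, which differ only in the percentage that triggers them.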

Model Pricing

Rivano’s pricing table is updated automatically when providers change prices. Current rates (as of January 2026):

Model	Input (per 1M tokens)	Output (per 1M tokens)
GPT-4o	$2.50	$10.00
GPT-4o-mini	$0.15	$0.60
Claude Sonnet 4	$3.00	$15.00
Claude Haiku 3.5	$0.80	$4.00
Gemini 2.0 Flash	$0.075	$0.30

For custom or fine-tuned models, set pricing manually in Settings → Model Pricing.

Tips for Reducing Costs

Use the right model — route simple tasks to cheaper models (GPT-4o-mini, Haiku) and reserve expensive models for complex reasoning.
Set per-agent budgets — prevent any single agent from consuming a disproportionate share.
Monitor output token ratios — if output tokens consistently exceed input by 10x+, your prompts may be too open-ended.
Cache frequent queries — Rivano’s cache hits are free and don’t count toward your budget.
Review weekly — the Costs dashboard highlights anomalies automatically.
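
The output-to-input ratio mentioned in the tips can be checked with a few lines of arithmetic. This is a minimal sketch: `outputRatio` and `isOutputHeavy` are hypothetical helpers, and the default limit of 10 is taken from the 10x figure in the tip above.

```typescript
// Hypothetical helper: output tokens per input token.
function outputRatio(inputTokens: number, outputTokens: number): number {
  if (inputTokens <= 0) {
    throw new Error("inputTokens must be positive");
  }
  return outputTokens / inputTokens;
}

// Hypothetical helper: true when output is at least `limit` times the input.
function isOutputHeavy(inputTokens: number, outputTokens: number, limit: number = 10): boolean {
  return outputRatio(inputTokens, outputTokens) >= limit;
}

// Totals from the sample summary response on this page:
// 1,876,000 output / 2,145,000 input is about 0.87, well under the 10x mark.
const weeklyRatio = outputRatio(2_145_000, 1_876_000);
const weeklyFlag = isOutputHeavy(2_145_000, 1_876_000); // false
```

The same check could be run per agent or per user if token counts are available at that granularity.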
