How to Add a New Vendor

This guide walks through adding a new vendor rate limit bucket to Sluice via Terraform. After completing these steps, all Ontopix products using the Sluice SDK will be able to acquire slots against the new vendor dimension.

Prerequisites

  • Access to the .infra/ Terraform configuration
  • Vendor documentation with their rate limit values
  • Terraform CLI and AWS credentials for the target environment

Step 1: Determine vendor limits

Check the vendor's API documentation for their rate limit values. You need three pieces of information:

| What | Where to find it | Example |
|---|---|---|
| Limit value | Vendor docs, usually in headers (X-RateLimit-Limit) | 60 requests per minute |
| Limit window | Vendor docs | per minute, per second, concurrent |
| Cost per call | How many units each API call consumes | 1 for RPM, token count for TPM |

Limit types

| limit_type | When to use | refill_rate formula |
|---|---|---|
| requests | Requests per time window (RPM, RPS) | capacity / window_seconds |
| tokens | Token budget per time window (TPM) | capacity / window_seconds |
| concurrent | Max simultaneous connections | 0 (no refill -- tokens restored on release) |

Step 2: Calculate refill_rate

For time-based limits, divide capacity by the window in seconds:

refill_rate = capacity / window_seconds

Examples:

| Vendor limit | capacity | window_seconds | refill_rate |
|---|---|---|---|
| 500 RPM | 500 | 60 | 8.333 |
| 200,000 TPM | 200000 | 60 | 3333.33 |
| 60 RPM | 60 | 60 | 1.0 |
| 10 RPS | 10 | 1 | 10.0 |

For concurrent limits, set refill_rate = 0.
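
The calculation in this step can be sketched as a small Python helper. The function name `refill_rate` and its parameters are illustrative and not part of the Sluice SDK; the sketch assumes the window is given in seconds and that concurrent limits are identified by their `limit_type`.

```python
def refill_rate(capacity: float, window_seconds: float, limit_type: str = "requests") -> float:
    """Return tokens restored per second for a bucket (illustrative helper)."""
    if limit_type == "concurrent":
        # Concurrent limits do not refill over time; tokens return on release.
        return 0.0
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return capacity / window_seconds


if __name__ == "__main__":
    print(round(refill_rate(500, 60), 3))         # 8.333  (500 RPM)
    print(round(refill_rate(200_000, 60), 2))     # 3333.33 (200,000 TPM)
    print(refill_rate(60, 60))                    # 1.0    (60 RPM)
    print(refill_rate(10, 1))                     # 10.0   (10 RPS)
    print(refill_rate(5, 60, "concurrent"))       # 0.0
```

The printed values match the rows of the examples table, which makes the helper a quick way to double-check a hand calculation before editing Terraform.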

Step 3: Add the Terraform resource

Open .infra/vendors.tf and add an aws_dynamodb_table_item resource:

resource "aws_dynamodb_table_item" "anthropic_rpm" {
  table_name = aws_dynamodb_table.vendor_buckets.name
  hash_key   = aws_dynamodb_table.vendor_buckets.hash_key

  item = jsonencode({
    vendor_dimension = { S = "anthropic#rpm" }
    capacity         = { N = "60" }
    refill_rate      = { N = "1.0" }
    tokens           = { N = "60" }
    last_refill_at   = { N = "0" }
    cost_per_call    = { N = "1" }
    limit_type       = { S = "requests" }
    version          = { N = "0" }
  })
}

Attribute reference

| Attribute | Type | Description |
|---|---|---|
| vendor_dimension | S | Partition key. Format: {vendor}#{dimension} (e.g. anthropic#rpm) |
| capacity | N | Maximum token count (= vendor limit value) |
| refill_rate | N | Tokens restored per second. 0 for concurrent limits |
| tokens | N | Initial token count. Set equal to capacity |
| last_refill_at | N | Unix timestamp of last refill. Set to 0 for new buckets |
| cost_per_call | N | Tokens consumed per acquire() call |
| limit_type | S | One of: requests, tokens, concurrent |
| version | N | Optimistic lock counter. Set to 0 for new buckets |
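
The rules in the attribute reference can be checked mechanically before a bucket is committed. The following is a minimal sketch, not part of Sluice: `validate_new_bucket` is a hypothetical helper, and the allowed character set in `KEY_PATTERN` is an assumption (the guide only specifies the `{vendor}#{dimension}` shape).

```python
import re

ALLOWED_LIMIT_TYPES = {"requests", "tokens", "concurrent"}
# Assumption: vendor and dimension are lowercase alphanumerics, '_' or '-'.
KEY_PATTERN = re.compile(r"^[a-z0-9_-]+#[a-z0-9_-]+$")


def validate_new_bucket(item: dict) -> list[str]:
    """Return a list of problems found in a new bucket item (empty if none)."""
    problems = []
    if not KEY_PATTERN.match(item["vendor_dimension"]["S"]):
        problems.append("vendor_dimension must look like {vendor}#{dimension}")
    limit_type = item["limit_type"]["S"]
    if limit_type not in ALLOWED_LIMIT_TYPES:
        problems.append(f"unknown limit_type: {limit_type}")
    # DynamoDB numbers are carried as strings, so convert before comparing.
    if float(item["tokens"]["N"]) != float(item["capacity"]["N"]):
        problems.append("tokens must equal capacity for a new bucket")
    if limit_type == "concurrent" and float(item["refill_rate"]["N"]) != 0:
        problems.append("refill_rate must be 0 for concurrent limits")
    for field in ("last_refill_at", "version"):
        if float(item[field]["N"]) != 0:
            problems.append(f"{field} must be 0 for a new bucket")
    return problems
```

Passing the item from Step 3 as a Python dict should return an empty list; an item whose tokens differ from capacity returns a single message naming that problem.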

Step 4: Add the sandbox seed

Open sandbox/init.sh and add a matching put-item block so local development picks up the new bucket:

echo "Seeding bucket: anthropic#rpm"
awslocal dynamodb put-item \
  --table-name "${TABLE_NAME}" \
  --item '{
    "vendor_dimension": {"S": "anthropic#rpm"},
    "capacity": {"N": "60"},
    "refill_rate": {"N": "1.0"},
    "tokens": {"N": "60"},
    "last_refill_at": {"N": "0"},
    "cost_per_call": {"N": "1"},
    "limit_type": {"S": "requests"},
    "version": {"N": "0"}
  }' \
  --region "${REGION}"

After editing, reset the sandbox to pick up the change:

task sandbox:reset

Step 5: Deploy to AWS

Plan and apply for the target environment:

task infra:plan ENV=pre
# Review the plan output carefully
task infra:apply ENV=pre

The new bucket is live immediately after apply completes. No SDK changes or redeployments are needed -- all products using the Sluice SDK will be able to acquire slots against the new dimension right away.

Step 6: Verify from an SDK

import asyncio
from sluice import slot

async def main():
    async with slot("anthropic#rpm", timeout=10) as s:
        print(f"Acquired slot: {s.lease_id}")
        # Make your Anthropic API call here

asyncio.run(main())

If the slot is granted, the bucket is working correctly. If you get a RETRY_IN response immediately on a fresh bucket, check that tokens was set equal to capacity in the Terraform resource.
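
The troubleshooting hint above can be reasoned about with generic token-bucket arithmetic. The sketch below is not Sluice's implementation: `seconds_until_available` is a hypothetical helper, and it assumes tokens accrue linearly at refill_rate and that no refill has been credited yet.

```python
def seconds_until_available(tokens: float, cost_per_call: float, refill_rate: float) -> float:
    """Estimate the wait before one call can be admitted (generic token bucket)."""
    if tokens >= cost_per_call:
        return 0.0  # enough tokens: the call can proceed now
    if refill_rate == 0:
        return float("inf")  # no time-based refill (e.g. concurrent limits)
    # Time for the shortfall to accrue at the configured refill rate.
    return (cost_per_call - tokens) / refill_rate


if __name__ == "__main__":
    print(seconds_until_available(60, 1, 1.0))  # 0.0: bucket seeded full
    print(seconds_until_available(0, 1, 1.0))   # 1.0: bucket seeded empty
```

Under these assumptions, a bucket seeded with tokens = 0 instead of tokens = capacity has to wait for a refill before its first call, which would explain an immediate retry response on a fresh bucket.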

Complete example: adding anthropic#rpm with capacity=60

For this example, assume the Anthropic rate limit for your account is 60 requests per minute (check the vendor documentation for your actual value).

  1. capacity = 60
  2. refill_rate = 60 / 60 = 1.0 tokens per second
  3. limit_type = requests
  4. cost_per_call = 1

Add the Terraform resource shown in Step 3 above, add the sandbox seed shown in Step 4, then deploy per Step 5.
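
To keep the Terraform resource and the sandbox seed consistent, the item for this example can be generated once and pasted into both places. This is a minimal sketch under stated assumptions: `bucket_item` is a hypothetical helper (not part of the Sluice SDK), and it only reproduces the attribute layout described in Step 3.

```python
import json


def bucket_item(vendor: str, dimension: str, capacity: int, window_seconds: int,
                limit_type: str = "requests", cost_per_call: int = 1) -> dict:
    """Build a new-bucket item in DynamoDB attribute-value form."""
    # Concurrent limits have no time-based refill; others refill per second.
    rate = 0.0 if limit_type == "concurrent" else capacity / window_seconds
    return {
        "vendor_dimension": {"S": f"{vendor}#{dimension}"},
        "capacity": {"N": str(capacity)},
        "refill_rate": {"N": str(rate)},
        "tokens": {"N": str(capacity)},  # new buckets start full
        "last_refill_at": {"N": "0"},
        "cost_per_call": {"N": str(cost_per_call)},
        "limit_type": {"S": limit_type},
        "version": {"N": "0"},
    }


if __name__ == "__main__":
    # 60 requests per 60 seconds, as in the example above.
    print(json.dumps(bucket_item("anthropic", "rpm", 60, 60), indent=2))
```

The printed JSON can be used as the `--item` argument in the sandbox seed, and the same values map one-to-one onto the `jsonencode` block in the Terraform resource.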