Infrastructure Operations

Terraform layout, deployment workflow, and environment management for Sluice infrastructure.

Terraform layout

All infrastructure is defined in the .infra/ directory at the repository root.

File           Purpose
main.tf        AWS provider, DynamoDB table resource
variables.tf   Input variables (environment, table_name_prefix, aws_region)
vendors.tf     Vendor bucket items (aws_dynamodb_table_item per vendor dimension)
iam.tf         Consumer IAM role and least-privilege policy
reconciler.tf  Reconciler Lambda, execution role, EventBridge schedule
backend.tf     S3 remote state configuration
outputs.tf     Exported values (e.g., consumer_role_arn)

Environments

Environment  Backend              Table name                      Deployment
sandbox      LocalStack (Docker)  ontopix-vendor-buckets-sandbox  task sandbox:start
pre          AWS (eu-central-1)   ontopix-vendor-buckets-pre      task infra:apply ENV=pre
prod         AWS (eu-central-1)   ontopix-vendor-buckets-prod     task infra:apply ENV=prod

The sandbox does not use Terraform. It is managed by sandbox/init.sh via awslocal commands against LocalStack.

DynamoDB table

One table per environment: ontopix-vendor-buckets-{env}.

  • Billing mode: PAY_PER_REQUEST (on-demand)
  • Partition key: vendor_dimension (String)
  • TTL attribute: ttl (used by lease records)
  • No GSIs in v0.1.0

Environment isolation is at the table level -- partition keys contain no env prefix. A misconfigured SLUICE_TABLE_NAME therefore points at a different table entirely; it cannot expose cross-environment data within the same table.
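As a rough illustration of the table settings listed above, the following sketch builds the keyword arguments one might pass to boto3's create_table and update_time_to_live calls, for example when provisioning a local sandbox table. The helper names (table_name, create_table_kwargs, ttl_kwargs) are hypothetical and not part of Sluice; the block only constructs dictionaries and never contacts AWS.

```python
VALID_ENVS = ("sandbox", "pre", "prod")


def table_name(env: str) -> str:
    """Return the per-environment table name, ontopix-vendor-buckets-{env}."""
    if env not in VALID_ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"ontopix-vendor-buckets-{env}"


def create_table_kwargs(env: str) -> dict:
    """Keyword arguments shaped for boto3's DynamoDB create_table call."""
    return {
        "TableName": table_name(env),
        "BillingMode": "PAY_PER_REQUEST",  # on-demand billing
        "AttributeDefinitions": [
            {"AttributeName": "vendor_dimension", "AttributeType": "S"},
        ],
        "KeySchema": [
            # Partition key only; the table has no sort key and no GSIs.
            {"AttributeName": "vendor_dimension", "KeyType": "HASH"},
        ],
    }


def ttl_kwargs(env: str) -> dict:
    """Keyword arguments shaped for boto3's update_time_to_live call."""
    return {
        "TableName": table_name(env),
        "TimeToLiveSpecification": {"Enabled": True, "AttributeName": "ttl"},
    }
```

Because the environment appears only in the table name, rejecting unknown environment names up front is one cheap way to catch a mistyped ENV before it reaches AWS.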

Item types stored in the table

Item type      Key pattern                        Example
Vendor bucket  {vendor}#{dimension}               openai#rpm
Lease record   lease#{vendor}#{dimension}#{uuid}  lease#openai#rpm#a1b2c3d4

Vendor buckets are Terraform-managed. Lease records are created/deleted by the SDK and reconciler.
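The two key patterns can be sketched as small helpers. This is a minimal illustration, not the SDK's actual code: the function names are hypothetical, and the format of the generated uuid segment is an assumption, since only the pattern itself is documented here.

```python
import uuid

LEASE_PREFIX = "lease"


def _check_part(value: str) -> str:
    """Reject parts containing '#', which would make the key ambiguous."""
    if not value or "#" in value:
        raise ValueError(f"invalid key part: {value!r}")
    return value


def bucket_key(vendor: str, dimension: str) -> str:
    """Vendor bucket key: {vendor}#{dimension}."""
    return f"{_check_part(vendor)}#{_check_part(dimension)}"


def lease_key(vendor: str, dimension: str, lease_id=None) -> str:
    """Lease record key: lease#{vendor}#{dimension}#{uuid}."""
    if lease_id is None:
        # Assumption: any unique token works for the uuid segment.
        lease_id = uuid.uuid4().hex
    return f"{LEASE_PREFIX}#{bucket_key(vendor, dimension)}#{_check_part(lease_id)}"


def is_lease_key(key: str) -> bool:
    """True for lease records; assumes no vendor is literally named 'lease'."""
    return key.startswith(LEASE_PREFIX + "#")
```

For example, bucket_key("openai", "rpm") yields the bucket key shown in the table, and lease_key("openai", "rpm", "a1b2c3d4") yields the lease example.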

IAM

Consumer role

Defined in .infra/iam.tf. One role per environment: sluice-consumer-{env}.

Granted actions on the vendor buckets table:

  • dynamodb:GetItem -- read bucket state (pre-read before transaction)
  • dynamodb:TransactWriteItems -- atomic slot acquisition
  • dynamodb:UpdateItem -- concurrent release (non-transactional token restore)
  • dynamodb:DeleteItem -- release a slot (delete lease record)

Consuming services reference the role ARN from the Terraform output consumer_role_arn.

Reconciler role

Defined in .infra/reconciler.tf. One role per environment: sluice-reconciler-{env}.

Additional action beyond the consumer role:

  • dynamodb:Scan -- required to find expired leases

Plus AWSLambdaBasicExecutionRole for CloudWatch Logs.
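Scan is needed because each lease record has its own partition key and the table has no GSIs, so expired leases can only be found by walking the table and filtering. The sketch below applies such a filter to items already fetched in DynamoDB's attribute-value format. The function name find_expired_leases is hypothetical, and the rule "expired when ttl is at or before now, in epoch seconds" is an assumption; the real logic in _reconciler.py may differ.

```python
import time


def find_expired_leases(items, now=None):
    """Return the partition keys of lease records whose ttl has passed.

    `items` is a list of DynamoDB items in attribute-value form, e.g.
    {"vendor_dimension": {"S": "lease#openai#rpm#a1b2c3d4"}, "ttl": {"N": "100"}}.
    """
    if now is None:
        now = int(time.time())
    expired = []
    for item in items:
        key = item["vendor_dimension"]["S"]
        if not key.startswith("lease#"):
            continue  # vendor bucket item, not a lease record
        ttl_attr = item.get("ttl")
        if ttl_attr is None:
            continue  # no ttl recorded; leave it alone in this sketch
        if float(ttl_attr["N"]) <= now:
            expired.append(key)
    return expired
```

A real reconciler would also have to follow Scan pagination and restore the tokens held by each expired lease; both are left out here.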

Deployment

Planning

task infra:plan ENV=pre

Review the plan output. Pay attention to:

  • Changes to aws_dynamodb_table_item resources (vendor bucket updates)
  • Any table-level changes (billing mode, key schema) -- treat these as destructive; a key schema change forces Terraform to replace the table
  • Reconciler Lambda code changes (source hash)

Applying

task infra:apply ENV=pre

Always apply to pre first, verify behavior, then apply to prod.

terraform apply is never run in CI. It is always manual and requires human approval.

Common operations

Operation                   Action
Add a vendor bucket         Add aws_dynamodb_table_item to vendors.tf, plan, apply
Change a rate limit         Update the capacity and refill_rate in vendors.tf, plan, apply
Update reconciler code      Edit python/src/sluice/_reconciler.py, plan (source hash changes), apply
Grant a new service access  Add IAM policy in iam.tf or reference consumer_role_arn output

Vendor configuration changes take effect for all products as soon as the apply completes -- the values live in the table, so no SDK redeployment is needed.
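The item argument of aws_dynamodb_table_item takes a JSON document in DynamoDB attribute-value format. To illustrate what a vendor bucket item could look like, the sketch below renders one. The attribute names capacity and refill_rate come from the table above; their numeric type, the example values, and the helper name bucket_item_json are assumptions, and the real items in vendors.tf may carry further attributes.

```python
import json


def bucket_item_json(vendor, dimension, capacity, refill_rate):
    """Render a vendor bucket item as DynamoDB attribute-value JSON."""
    item = {
        "vendor_dimension": {"S": f"{vendor}#{dimension}"},
        # DynamoDB encodes numbers as strings under the "N" type tag.
        "capacity": {"N": str(capacity)},
        "refill_rate": {"N": str(refill_rate)},
    }
    return json.dumps(item, indent=2)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(bucket_item_json("openai", "rpm", capacity=500, refill_rate=8))
```

Changing a rate limit then amounts to editing the two numeric values and letting the plan show the resulting item diff.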

State management

Terraform state is stored in S3 with DynamoDB locking:

Setting     Value
Bucket      ontopix-tfstate
Key         services/sluice/terraform.tfstate
Region      eu-west-1
Encryption  Enabled
Lock table  ontopix-tflocks

The state bucket and lock table are in eu-west-1 (Ontopix shared infra region). The Sluice resources themselves are deployed to eu-central-1.

Resource tags

All resources are tagged via the provider default_tags:

Service     = "sluice"
Environment = "{env}"
ManagedBy   = "terraform"