Schema Design Principles
Comprehensive design principles for creating high-quality, maintainable JSON schemas in the Ontopix ecosystem.
Core Philosophy
Ontopix schemas follow three fundamental principles:
- Strictness: Explicit over implicit, constrained over open
- Self-Description: Schemas document themselves completely
- Immutability: Published schemas never change; version instead
These principles ensure data integrity, clear contracts, and smooth evolution across all consuming services.
1. Strictness and Explicitness
No Implicit Properties
Rule: Set additionalProperties: false at ALL object levels.
Why: Prevent unexpected fields, catch typos, enforce data contracts.
Example:
{
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"id": {"type": "string"},
"name": {"type": "string"}
},
"required": ["id", "name"],
"additionalProperties": false // ← Required at nested level too
}
},
"required": ["user"],
"additionalProperties": false // ← Required at root level
}
Violation Detection:
task test:structure # Checks all objects have additionalProperties: false
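How task test:structure performs this check is not shown in this guide. As a rough illustration only, a check of this kind could be sketched in Python as below; find_open_objects is a hypothetical helper name, and the walk is deliberately naive (it would also inspect object-shaped data inside enum or default values).

```python
from typing import Any, Iterator


def find_open_objects(schema: Any, path: str = "#") -> Iterator[str]:
    """Yield paths of object schemas that do not set additionalProperties to false."""
    if isinstance(schema, dict):
        is_object = schema.get("type") == "object"
        # Only a literal JSON false closes the object; a missing key leaves it open.
        if is_object and schema.get("additionalProperties") is not False:
            yield path
        for key, value in schema.items():
            yield from find_open_objects(value, f"{path}/{key}")
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from find_open_objects(item, f"{path}/{index}")


example = {
    "type": "object",
    "properties": {
        # Nested object without additionalProperties: should be reported.
        "user": {"type": "object", "properties": {"id": {"type": "string"}}},
    },
    "additionalProperties": False,
}
print(list(find_open_objects(example)))  # ['#/properties/user']
```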
Explicit Types
Rule: Every field must have an explicit type declaration.
Why: Prevent ambiguity, enable type-safe code generation.
Good:
{
"age": {
"type": "integer",
"minimum": 0,
"maximum": 150
}
}
Bad:
{
"age": {
"minimum": 0 // ❌ Missing type declaration
}
}
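A missing type can be detected mechanically. The sketch below is one possible check, not the project's actual tooling: properties_missing_type is a hypothetical name, it only descends through properties, and it assumes that a property delegating to $ref or to a oneOf/anyOf/allOf combinator counts as typed.

```python
from typing import Any, Dict, List

# Keywords that, by assumption, stand in for an explicit "type" on a property.
_TYPE_CARRIERS = ("type", "$ref", "oneOf", "anyOf", "allOf")


def properties_missing_type(schema: Dict[str, Any], path: str = "#") -> List[str]:
    """Return paths of properties that declare neither a type nor a delegating keyword."""
    missing: List[str] = []
    for name, sub in schema.get("properties", {}).items():
        if not isinstance(sub, dict):
            continue  # boolean schemas (true/false) carry no keywords to inspect
        here = f"{path}/properties/{name}"
        if not any(keyword in sub for keyword in _TYPE_CARRIERS):
            missing.append(here)
        # Recurse so nested object properties are checked as well.
        missing.extend(properties_missing_type(sub, here))
    return missing


schema = {"properties": {"age": {"minimum": 0}, "name": {"type": "string"}}}
print(properties_missing_type(schema))  # ['#/properties/age']
```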
Required vs Optional
Rule: Clearly define which fields are required.
Why: Explicit contracts prevent confusion and errors.
Good:
{
"type": "object",
"properties": {
"id": {"type": "string"},
"name": {"type": "string"},
"description": {"type": "string"}
},
"required": ["id", "name"] // description is optional
}
Bad:
{
"type": "object",
"properties": {
"id": {"type": "string"},
"name": {"type": "string"}
}
// ❌ Missing required field declaration
}
Enum Constraints
Rule: Use enum for fields with limited valid values.
Why: Prevent invalid data, document allowed values.
Example:
{
"status": {
"type": "string",
"enum": ["pending", "active", "completed", "failed"],
"description": "Current status of the process"
}
}
Avoid Ambiguity
Rule: Prefer explicit structures over oneOf/anyOf when possible.
Why: Simpler validation, clearer contracts, better error messages.
Good (explicit):
{
"source_type": {
"type": "string",
"enum": ["file", "api", "database"]
},
"file_path": {
"type": "string",
"description": "Required when source_type is 'file'"
}
}
Acceptable (when truly needed):
{
"contact": {
"oneOf": [
{
"type": "object",
"properties": {
"email": {"type": "string", "format": "email"}
},
"required": ["email"]
},
{
"type": "object",
"properties": {
"phone": {"type": "string", "pattern": "^\\+?[0-9]+$"}
},
"required": ["phone"]
}
]
}
}
2. Self-Description
Schema Identification
Rule: Every schema must include schema_type and schema_version.
Why: Self-identifying data, version tracking, validation routing.
Required fields:
{
"schema_type": {
"type": "string",
"const": "customer-interaction",
"description": "Schema type identifier"
},
"schema_version": {
"type": "string",
"pattern": "^v\\d+\\.\\d+(-(alpha|beta|rc)\\d+)?$",
"description": "Schema version in vMAJOR.MINOR[-stage] format"
}
}
Enforcement:
task test:structure # Validates schema_type and schema_version presence
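The schema_version pattern can be exercised directly before it is relied on. The sketch below applies the same expression with Python's re module; is_valid_version is a hypothetical helper. re.ASCII and fullmatch are used so that \d and the end anchor behave like the ECMA 262 regular expressions that JSON Schema patterns are defined against.

```python
import re

# Same expression as the schema_version pattern above.
SCHEMA_VERSION = re.compile(r"^v\d+\.\d+(-(alpha|beta|rc)\d+)?$", re.ASCII)


def is_valid_version(value: str) -> bool:
    """True when value has the vMAJOR.MINOR[-stageN] shape."""
    return SCHEMA_VERSION.fullmatch(value) is not None


for candidate in ["v1.0", "v1.0-beta1", "v2.13-rc2", "1.0", "v1.0-beta", "v1.0.1"]:
    print(candidate, is_valid_version(candidate))
```

The first three candidates match; the last three do not (missing v prefix, stage without a number, and a third version component).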
Metadata Fields
Rule: Include $schema, $id, title, and description at root level.
Why: JSON Schema compliance, documentation, tooling support.
Example:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://schemas.platform.ontopix.ai/customer-interaction/v1.0-beta1.json",
"title": "Customer Interaction Schema",
"description": "Unified schema for multi-channel customer interactions with detailed metrics, semantic analysis, and event-based structure"
}
Field Documentation
Rule: Every property must have a clear description.
Why: Self-documenting schemas, reduced need for external docs.
Good:
{
"response_time": {
"type": "number",
"minimum": 0,
"description": "Time in seconds from customer message to agent response"
}
}
Bad:
{
"response_time": {
"type": "number" // ❌ Missing description
}
}
3. Evolution and Compatibility
Immutability
Rule: Published schemas are immutable; create new versions for changes.
Why: Stability for consumers, clear version history, safe deployments.
Workflow:
# ❌ NEVER do this
edit audit-criteria/v1.0-beta1.json
# ✅ ALWAYS do this
cp audit-criteria/v1.0-beta1.json audit-criteria/v1.0.json
# Edit v1.0.json with your changes
# Update schema_version and $id
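Immutability can also be guarded mechanically. The sketch below assumes a hypothetical manifest that maps each published file to the SHA-256 digest recorded when it was published; neither the manifest nor the function names are part of the Ontopix tooling described here.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def digest(path: Path) -> str:
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def modified_published_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return manifest entries that are missing or whose content changed since publication."""
    return sorted(
        relative
        for relative, recorded in manifest.items()
        if not (root / relative).is_file() or digest(root / relative) != recorded
    )
```

A CI step could fail the build whenever modified_published_files returns a non-empty list, which forces changes into a new version file instead.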
Backward Compatibility by Default
Rule: Prefer MINOR version bumps (backward-compatible) over MAJOR (breaking).
Why: Minimize disruption, allow gradual migration.
MINOR changes (backward-compatible: data valid under the previous version stays valid):
- Adding optional fields
- Adding enum values
- Relaxing constraints (e.g., increasing maxLength)
- Adding definitions to $defs
MAJOR changes (breaking: previously valid data may be rejected):
- Removing required fields
- Changing field types
- Removing enum values
- Tightening constraints
- Restructuring object hierarchy
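The two lists above can be turned into a rough automated classification. The following is a minimal sketch under stated assumptions: classify_change is a hypothetical helper, it compares only top-level properties, types, enum values, and required lists, and it ignores numeric or length constraints. Removing any field is treated as breaking because additionalProperties: false would reject data that still carries it.

```python
from typing import Any, Dict, List, Tuple


def classify_change(old: Dict[str, Any], new: Dict[str, Any]) -> Tuple[str, List[str]]:
    """Return ("MAJOR" | "MINOR" | "NONE", reasons) for a schema change."""
    breaking: List[str] = []
    additive: List[str] = []
    old_props, new_props = old.get("properties", {}), new.get("properties", {})
    old_req, new_req = set(old.get("required", [])), set(new.get("required", []))

    for name, old_def in old_props.items():
        if name not in new_props:
            breaking.append(f"field removed: {name}")
            continue
        new_def = new_props[name]
        if old_def.get("type") != new_def.get("type"):
            breaking.append(f"type changed: {name}")
        old_enum, new_enum = old_def.get("enum"), new_def.get("enum")
        if old_enum is not None and new_enum is not None:
            if any(value not in new_enum for value in old_enum):
                breaking.append(f"enum value removed: {name}")
            if any(value not in old_enum for value in new_enum):
                additive.append(f"enum value added: {name}")
        elif new_enum is not None:
            breaking.append(f"enum introduced: {name}")  # tightening
        elif old_enum is not None:
            additive.append(f"enum dropped: {name}")  # relaxing

    for name in new_props:
        if name not in old_props:
            target = breaking if name in new_req else additive
            target.append(f"field added: {name}")
    for name in sorted(new_req - old_req):
        if name in old_props:
            breaking.append(f"field became required: {name}")

    level = "MAJOR" if breaking else "MINOR" if additive else "NONE"
    return level, breaking + additive
```

A result of MAJOR signals that the new file needs a new major version number; the reasons list is meant as raw material for the migration notes.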
Deprecation Path
Rule: When introducing breaking changes, document migration paths.
Why: Help consumers migrate, maintain service continuity.
Example (in schema .md file):
## Migration from v1.0 to v2.0
### Breaking Changes
1. **Field Rename**: `customer_interaction_id` → `id`
- Update all references to use new field name
2. **Type Change**: `metadata` from object → flexible bag
- Wrap existing metadata in appropriate structure
### Migration Script
See `scripts/migrate_v1_to_v2.py` for automated conversion.
Version Coexistence
Rule: Multiple versions can coexist; don't remove old versions.
Why: Support gradual migration, avoid forced upgrades.
Directory structure:
audit-criteria/
├── v1.0-alpha1.json # Still available
├── v1.0-beta1.json # Still available
└── v1.0.json # Latest stable
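When several versions sit side by side, consumers need a deterministic way to order them. The sketch below assumes the conventional ordering alpha < beta < rc < stable within the same MAJOR.MINOR, which matches the directory listing above but is not stated as a rule in this guide; sort_key and latest_stable are hypothetical names.

```python
import re
from typing import List, Optional, Tuple

_VERSION = re.compile(r"v(\d+)\.(\d+)(?:-(alpha|beta|rc)(\d+))?", re.ASCII)
# Assumed ordering of pre-release stages; a version without a stage is stable.
_STAGE_RANK = {"alpha": 0, "beta": 1, "rc": 2, None: 3}


def sort_key(version: str) -> Tuple[int, int, int, int]:
    """Map 'v1.0-beta1' to a tuple that sorts in release order."""
    match = _VERSION.fullmatch(version)
    if match is None:
        raise ValueError(f"not a schema version: {version!r}")
    major, minor, stage, number = match.groups()
    return (int(major), int(minor), _STAGE_RANK[stage], int(number or 0))


def latest_stable(versions: List[str]) -> Optional[str]:
    """Return the highest version without a pre-release stage, or None."""
    stable = [v for v in versions if "-" not in v]
    return max(stable, key=sort_key, default=None)


available = ["v1.0", "v1.0-alpha1", "v1.0-beta1"]
print(sorted(available, key=sort_key))  # ['v1.0-alpha1', 'v1.0-beta1', 'v1.0']
print(latest_stable(available))  # v1.0
```

Comparing numeric tuples instead of raw strings keeps v1.10 ordered after v1.9.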
4. Validation Rigor
Comprehensive Constraints
Rule: Use validation keywords where applicable.
Available constraints:
- Numbers:
minimum,maximum,multipleOf - Strings:
minLength,maxLength,pattern - Arrays:
minItems,maxItems,uniqueItems - Objects:
minProperties,maxProperties
Example:
{
"email": {
"type": "string",
"format": "email",
"minLength": 5,
"maxLength": 254,
"description": "Valid email address per RFC 5322"
},
"age": {
"type": "integer",
"minimum": 0,
"maximum": 150,
"description": "Age in years"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"minItems": 1,
"maxItems": 10,
"uniqueItems": true,
"description": "Between 1 and 10 unique tags"
}
}
Format Validation
Rule: Use pattern for structured strings.
Why: Prevent malformed data (IDs, dates, codes).
Examples:
{
"id": {
"type": "string",
"pattern": "^int-[a-zA-Z0-9]+$",
"description": "Interaction ID in format 'int-{alphanumeric}'"
},
"date": {
"type": "string",
"pattern": "^\\d{4}-\\d{2}-\\d{2}$",
"description": "Date in YYYY-MM-DD format"
},
"uuid": {
"type": "string",
"pattern": "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
"description": "UUID in canonical lowercase 8-4-4-4-12 hex form (the pattern does not check the version)"
}
}
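A pattern constrains only the shape of a string. The date pattern above, for instance, accepts 2026-13-45. Where calendar validity matters, a consumer can add a second check; the sketch below is one way to do that in Python, with is_calendar_date a hypothetical helper.

```python
import re
from datetime import date

# Same shape as the "date" pattern above (anchors are implied by fullmatch).
DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def is_calendar_date(value: str) -> bool:
    """True when value matches YYYY-MM-DD and names a real calendar day."""
    if DATE_SHAPE.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)  # raises ValueError for month 13, Feb 30, ...
    except ValueError:
        return False
    return True


print(DATE_SHAPE.fullmatch("2026-13-45") is not None)  # True: shape only
print(is_calendar_date("2026-13-45"))  # False
print(is_calendar_date("2026-01-28"))  # True
```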
Range Constraints
Rule: Specify valid ranges for numeric fields.
Why: Prevent nonsensical values, document expectations.
Example:
{
"score": {
"type": "number",
"minimum": 0.0,
"maximum": 1.0,
"description": "Normalized score between 0.0 and 1.0"
},
"percentage": {
"type": "integer",
"minimum": 0,
"maximum": 100,
"description": "Percentage value from 0 to 100"
}
}
Array Constraints
Rule: Define size and uniqueness constraints for arrays.
Why: Prevent empty arrays, huge lists, duplicate entries.
Example:
{
"participants": {
"type": "array",
"items": {"$ref": "#/$defs/Participant"},
"minItems": 1,
"description": "At least one participant required"
},
"tags": {
"type": "array",
"items": {"type": "string"},
"maxItems": 20,
"uniqueItems": true,
"description": "Up to 20 unique tags"
}
}
5. Reusability and Maintainability
DRY Principle
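Rule: Use $ref and $defs for repeated structures. (Draft-07 formally names this keyword definitions; $defs is the spelling introduced in Draft 2019-09. A reference such as #/$defs/Participant is a JSON Pointer, so validators generally resolve it regardless of the keyword name.)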
Why: Reduce duplication, maintain consistency, simplify updates.
Example:
{
"$defs": {
"Participant": {
"type": "object",
"properties": {
"id": {"type": "string"},
"role": {
"type": "string",
"enum": ["customer", "agent", "supervisor"]
}
},
"required": ["id", "role"],
"additionalProperties": false
}
},
"properties": {
"participants": {
"type": "array",
"items": {"$ref": "#/$defs/Participant"}
}
}
}
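To make the mechanics concrete: a same-document $ref is a JSON Pointer (RFC 6901) into the schema itself. The sketch below resolves such a reference by hand. resolve_local_ref is a hypothetical helper; it ignores remote references and percent-encoding, which a real validator handles.

```python
from typing import Any, Dict


def resolve_local_ref(document: Dict[str, Any], ref: str) -> Any:
    """Resolve a same-document reference such as '#/$defs/Participant'."""
    if not ref.startswith("#"):
        raise ValueError(f"only same-document references are handled: {ref!r}")
    pointer = ref[1:]
    if pointer and not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer fragment: {ref!r}")
    node: Any = document
    for token in pointer.split("/")[1:]:
        # RFC 6901 unescaping: '~1' becomes '/', then '~0' becomes '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


doc = {
    "$defs": {"Participant": {"type": "object", "required": ["id", "role"]}},
    "properties": {
        "participants": {"type": "array", "items": {"$ref": "#/$defs/Participant"}}
    },
}
ref = doc["properties"]["participants"]["items"]["$ref"]
print(resolve_local_ref(doc, ref)["required"])  # ['id', 'role']
```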
Semantic Naming
Rule: Use descriptive, domain-specific names.
Why: Clear intent, easier understanding, better documentation.
Good:
{
"customer_interaction_id": {"type": "string"},
"evaluation_timestamp": {"type": "string"},
"compliance_score": {"type": "number"}
}
Bad:
{
"id": {"type": "string"}, // Which ID?
"timestamp": {"type": "string"}, // Timestamp of what?
"score": {"type": "number"} // Score for what?
}
Consistent Naming
Rule: Follow a consistent naming convention.
Why: Predictability, reduced cognitive load.
Standard: snake_case for JSON field names (Ontopix convention)
Example:
{
"customer_interaction_id": "...",
"evaluation_timestamp": "...",
"participant_count": 5
}
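The convention is easy to check automatically. The sketch below collects property names that are not snake_case; non_snake_case_names is a hypothetical helper, and the exact definition of snake_case used here (lowercase letters and digits, single underscores between words) is an assumption.

```python
import re
from typing import Any, List

# Assumed definition: starts with a lowercase letter, words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def non_snake_case_names(schema: Any) -> List[str]:
    """Collect property names, at any depth, that are not snake_case."""
    found: List[str] = []
    if isinstance(schema, dict):
        properties = schema.get("properties")
        if isinstance(properties, dict):
            found.extend(n for n in properties if SNAKE_CASE.fullmatch(n) is None)
        for value in schema.values():
            found.extend(non_snake_case_names(value))
    elif isinstance(schema, list):
        for item in schema:
            found.extend(non_snake_case_names(item))
    return found


schema = {
    "properties": {
        "customer_interaction_id": {"type": "string"},
        "evaluationTimestamp": {"type": "string"},
        "source": {"type": "object", "properties": {"Channel": {"type": "string"}}},
    }
}
print(non_snake_case_names(schema))  # ['evaluationTimestamp', 'Channel']
```

Definition names under $defs (for example Participant) are not property names and are left alone.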
Modular Definitions
Rule: Define reusable types in $defs.
Why: Centralized definitions, easier updates, consistency.
Example:
{
"$defs": {
"TimeRange": {
"type": "object",
"properties": {
"start": {"type": "string", "format": "date-time"},
"end": {"type": "string", "format": "date-time"}
},
"required": ["start"],
"additionalProperties": false
}
},
"properties": {
"interaction_time": {"$ref": "#/$defs/TimeRange"},
"evaluation_time": {"$ref": "#/$defs/TimeRange"}
}
}
6. Documentation Completeness
Paired Documentation
Rule: Every .json schema has a corresponding .md file.
Why: Detailed explanations, examples, migration guides.
Structure:
customer-interaction/
├── v1.0-beta1.json # Schema definition
└── v1.0-beta1.md # Documentation
Documentation contents:
- Overview and purpose
- Field descriptions
- Complete examples
- Validation rules
- Edge cases
- Migration guides (for new versions)
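The pairing rule lends itself to a simple repository check. The following is a sketch only, assuming the directory layout shown above; schemas_without_docs is a hypothetical helper and is not one of the task commands referenced in this guide.

```python
from pathlib import Path
from typing import List


def schemas_without_docs(root: Path) -> List[str]:
    """Return .json files under root that have no sibling .md file with the same stem."""
    return sorted(
        schema.relative_to(root).as_posix()
        for schema in root.rglob("*.json")
        # with_suffix swaps only the final suffix: v1.0-beta1.json -> v1.0-beta1.md
        if not schema.with_suffix(".md").is_file()
    )
```

Run against the repository root, an empty result means every schema version has its paired documentation file.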
Examples Included
Rule: Provide complete, realistic examples.
Why: Understanding, testing, reference implementations.
Example (in .md file):
## Complete Example
```json
{
"schema_type": "customer-interaction",
"schema_version": "v1.0-beta1",
"id": "int-12345",
"source": {
"media": "text",
"channel": "email",
"synchronicity": "asynchronous"
},
"time": {
"start": "2026-01-28T10:30:00Z"
},
"participants": [
{"id": "customer-001", "role": "customer"},
{"id": "agent-001", "role": "agent"}
],
"events": [...]
}
```
Edge Cases Documented
Rule: Explain special cases and constraints.
Why: Prevent misuse, clarify intentions.
Example (in .md file):
## Edge Cases
### Empty Participant Lists
- **Not Allowed**: `participants` must have at least one item
- **Validation**: `minItems: 1` enforces this constraint
### Overlapping Time Ranges
- **Allowed**: Events can have overlapping time ranges
- **Use Case**: Concurrent speakers in multi-party calls
### Missing End Time
- **Allowed**: `time.end` is optional for ongoing interactions
- **Interpretation**: Interaction is still in progress
Usage Guidance
Rule: Include common use cases and patterns.
Why: Help developers integrate schemas correctly.
Example (in .md file):
## Common Use Cases
### Use Case 1: Email Interaction
```json
{
"source": {
"media": "text",
"channel": "email",
"synchronicity": "asynchronous"
}
}
```
### Use Case 2: Live Phone Call
```json
{
"source": {
"media": "audio",
"channel": "phone",
"synchronicity": "synchronous"
}
}
```
7. Standards Compliance
JSON Schema Draft-07
Rule: All schemas comply with Draft-07 specification.
Why: Tool compatibility, validation consistency.
Required declaration:
{
"$schema": "http://json-schema.org/draft-07/schema#"
}
Validation:
task test:syntax # Validates JSON Schema compliance
Industry Standards
Rule: Follow domain-specific standards.
Why: Interoperability, familiarity, correctness.
Examples:
- ISO 8601: Date-time formats
- RFC 5322: Email addresses
- ISO 639: Language codes
- ISO 3166: Country codes
Consistent Types
Rule: Use standard types consistently.
Why: Predictability, tooling support.
Standard types:
{
"timestamp": {"type": "string", "format": "date-time"},
"email": {"type": "string", "format": "email"},
"uri": {"type": "string", "format": "uri"},
"uuid": {"type": "string", "format": "uuid"}
}
8. Cross-Schema Consistency
Shared Vocabularies
Rule: Reuse enum values and field names across related schemas.
Why: Consistency, predictability, reduced learning curve.
Example:
// Used in multiple schemas
"severity": {
"type": "string",
"enum": ["critical", "major", "minor", "warning", "info"]
}
Consistent Patterns
Rule: Use similar structures for similar concepts.
Why: Familiarity, code reuse, reduced cognitive load.
Example (participant structure used consistently):
"participants": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {"type": "string"},
"role": {"type": "string"}
},
"required": ["id", "role"],
"additionalProperties": false
}
}
Dependency Documentation
Rule: Clearly document relationships between schemas.
Why: Understanding data flows, validation chains.
Example (in .md file):
## Schema Dependencies
This schema references:
- `customer-interaction@v1.0-beta1` - Source data for evaluation
- `audit-criteria@v1.0-beta1` - Criteria definitions
Referenced by:
- Reporting dashboards
- Compliance systems
Aligned Constraints
Rule: When schemas reference each other, ensure constraint alignment.
Why: Prevent validation mismatches, ensure referential integrity.
Example:
// In AuditCriteria
"target": {
"schema": {
"type": "string",
"pattern": "^[a-z-]+@v\\d+\\.\\d+(-(alpha|beta|rc)\\d+)?$"
}
}
// In AuditResult (must match)
"evaluated_objects": {
"items": {
"schema": {
"type": "string",
"pattern": "^[a-z-]+@v\\d+\\.\\d+(-(alpha|beta|rc)\\d+)?$"
}
}
}
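One way to keep the two patterns from drifting apart is to define the expression once and use it wherever references are produced or checked. The sketch below shows that idea on the consumer side; SCHEMA_REF and parse_schema_ref are hypothetical names, and the expression is the one used in both schemas above.

```python
import re
from typing import Tuple

# Single source of truth for the "type@version" reference format.
SCHEMA_REF = re.compile(r"([a-z-]+)@(v\d+\.\d+(?:-(?:alpha|beta|rc)\d+)?)", re.ASCII)


def parse_schema_ref(value: str) -> Tuple[str, str]:
    """Split 'customer-interaction@v1.0-beta1' into (schema_type, schema_version)."""
    match = SCHEMA_REF.fullmatch(value)
    if match is None:
        raise ValueError(f"not a schema reference: {value!r}")
    return match.group(1), match.group(2)


print(parse_schema_ref("customer-interaction@v1.0-beta1"))
# ('customer-interaction', 'v1.0-beta1')
```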
Compliance Checklist
Before publishing a schema, verify:
- additionalProperties: false on ALL object levels
- schema_type field with const value
- schema_version field with pattern validation
- $schema, $id, title, description at root
- All fields have description
- All fields have explicit type
- required fields clearly defined
- Validation constraints where applicable
- Reusable definitions in $defs
- Consistent naming convention
- Paired .md documentation file
- Complete, realistic examples
- Migration guide (for new versions)
- Edge cases documented
Automated checks:
task test:syntax # JSON syntax validation
task test:structure # Schema structure validation
task lint:check # Formatting consistency
See Also
- Creating Schemas Guide - Step-by-step schema creation
- Versioning Guide - Version management rules
- Schema Templates - Starter templates
- Compliance Checklist - Quality verification
Last Updated: 2026-01-28 Principles Count: 8 Status: Stable