SDIF Specification

Overview

This section defines the normative specification for SDIF (Semantic Data Interchange Format) version 1.0. The format version is independent of any library or tooling version.

Property             Value
Format version       SDIF 1.0
Status               Stable
Encoding             UTF-8
MIME type            application/sdif (proposed)
Canonical extension  .sdif.canon
AI extension         .sdif.ai

Conformance Language

This specification uses the key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL as defined in RFC 2119.

Scope

The SDIF 1.0 specification covers the following components:

  • Parser — tokenization, directive recognition, block structure, and AST construction
  • AST — the abstract syntax tree representing parsed SDIF documents
  • Schema validation — type-checking fields, tables, columns, relations, and rule functions against a Schema document
  • Canonical syntax v1 — the deterministic serialization form (canonical-syntax-v1) used for reproducible storage and hashing
  • AI projection — the compact, alias-enriched .sdif.ai form intended for language-model consumption

Profiles

SDIF defines three profiles. Each has a directive header, a file extension, and an intended use:

Profile    Directive                                 Extension    Purpose
Source     @sdif 1.0                                 .sdif        Human-authored, may include comments and flexible whitespace
Canonical  @sdif 1.0 (with canonical serialization)  .sdif.canon  Deterministic, hashable, machine-produced
AI         @sdif.ai 1.0                              .sdif.ai     Compact, alias-enriched projection for language models

All three profiles share the same fundamental grammar. Canonical and AI profiles impose additional constraints defined in their respective sections.
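
As an illustration of the profile table above, the following Python sketch maps a file name to its profile by extension and reads the directive header of a document. It is not part of the specification. The function names (profile_from_filename, directive_family) are hypothetical, and the sketch assumes that the directive is the first non-blank line and that its tokens are separated by whitespace; the Directives page is the authoritative source for the actual rules.

```python
from pathlib import PurePath
from typing import Optional

# Extension -> profile, copied from the profile table.
# No listed extension is a suffix of another, so lookup order is irrelevant.
PROFILE_BY_EXTENSION = (
    (".sdif.canon", "canonical"),
    (".sdif.ai", "ai"),
    (".sdif", "source"),
)


def profile_from_filename(filename: str) -> Optional[str]:
    """Return 'source', 'canonical', or 'ai' based on the file extension."""
    name = PurePath(filename).name
    for extension, profile in PROFILE_BY_EXTENSION:
        if name.endswith(extension):
            return profile
    return None


def directive_family(text: str) -> Optional[str]:
    """Return 'sdif' or 'sdif.ai' for a document whose header is version 1.0.

    Assumption (not taken from the spec): the directive is the first
    non-blank line and looks like '@sdif 1.0' or '@sdif.ai 1.0'.
    Source and Canonical documents both use '@sdif 1.0', so this check
    alone cannot tell them apart.
    """
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines before the header
        if tokens[:2] == ["@sdif", "1.0"]:
            return "sdif"
        if tokens[:2] == ["@sdif.ai", "1.0"]:
            return "sdif.ai"
        return None  # first non-blank line is not a recognized directive
    return None
```

Because the Source and Canonical profiles share the @sdif 1.0 directive, a tool following this sketch would rely on the file extension, or on a check against the canonical serialization rules, to distinguish them.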

Specification Pages

  • Lexical Structure — encoding, line endings, separators, comments, identifiers
  • Directives — @sdif, @sdif.ai, and @profile directives
  • Document Model — kind, fields, tables, relations, rules, narratives
  • Scalar Values — string, integer, float, boolean, null, date, datetime
  • Tables — HTAB-delimited table syntax and column types
  • Relations — triple-style relation syntax
  • Schemas — Schema kind and validation model
  • Canonicalization — canonical-syntax-v1 contract and pipeline
  • AI Projection — .sdif.ai format and round-trip requirements
  • Conformance — test fixture layout and conformance requirements
  • Security — threat model and safe-handling guidance