
Tables

Tables are the primary mechanism for structured, repeating data in SDIF. They are compact by design: the column header is declared once, rows follow with tab-separated values, and no object delimiters, brackets, or closing tags are needed.

Declaration Syntax

name[col1,col2,col3]:

A table declaration consists of:

  • The table name (a bare identifier, lowercase with optional underscores)
  • A bracketed, comma-separated column list
  • A colon terminating the header

Immediately following the header, each row is indented with two spaces and its values are separated by a literal HORIZONTAL TAB character (U+0009).

milestones[id,status,gate,evidence]:
  R1	done	validate-syntax	reports/syntax.md
  R2	done	validate-canonical	reports/canonical.md
  R3	pending	validate-schema	reports/schema.md
  R4	pending	validate-semantics	reports/semantics.md
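As a rough illustration of the declaration rules above, here is a minimal Python sketch that reads one table. The names `HEADER_RE` and `parse_table` are hypothetical, the identifier pattern is an assumption based on the "lowercase with optional underscores" description, and quoted values are not handled.

```python
import re

# Assumed identifier shape: lowercase letters, digits, underscores.
HEADER_RE = re.compile(r"^([a-z_][a-z0-9_]*)\[([a-z0-9_,]+)\]:$")

def parse_table(text):
    """Parse one table: a header line followed by indented, tab-separated rows."""
    lines = text.splitlines()
    match = HEADER_RE.match(lines[0])
    if match is None:
        raise ValueError(f"not a table header: {lines[0]!r}")
    name = match.group(1)
    columns = match.group(2).split(",")
    rows = []
    for line in lines[1:]:
        if not line.startswith("  "):
            break  # rows end at the first line without the two-space indent
        values = line[2:].split("\t")  # only U+0009 separates columns
        if len(values) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(values)}")
        rows.append(dict(zip(columns, values)))
    return name, columns, rows

doc = (
    "milestones[id,status,gate,evidence]:\n"
    "  R1\tdone\tvalidate-syntax\treports/syntax.md\n"
    "  R2\tdone\tvalidate-canonical\treports/canonical.md\n"
)
name, columns, rows = parse_table(doc)
```

Each row comes back as a dictionary keyed by column name, with every value kept as a plain string.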

The Tab Separator

Column values are separated by a literal tab character — U+0009. This is not a display convention or a suggestion; it is a hard format requirement. Parsers must not treat sequences of spaces as column delimiters.

When reading SDIF files in editors, enabling visible whitespace or "Show invisibles" makes tab characters visible as arrows or glyphs.

In source code and fixtures, embed actual tab bytes between values. Do not use \t escape sequences in raw SDIF text unless your toolchain explicitly handles them.
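The distinction between a real tab byte and an unprocessed `\t` escape can be illustrated in Python. This is only a sketch: the helper `has_literal_escape` is hypothetical, and it is a heuristic that would also flag a legitimate value containing a backslash followed by `t`.

```python
# In a normal Python string literal, "\t" is resolved by Python itself into
# a real U+0009 character, so the resulting text contains actual tab bytes.
row = "  " + "\t".join(["R1", "done", "validate-syntax", "reports/syntax.md"])

# A raw string keeps the backslash and the letter "t" as two characters,
# which is the kind of unprocessed escape that should not reach an SDIF file.
raw_row = r"  R1\tdone\tvalidate-syntax\treports/syntax.md"

def has_literal_escape(line):
    """Return True if the line contains a backslash followed by 't'."""
    return "\\t" in line
```

Running a check like this over fixtures before committing them can catch escapes that a template engine or editor left unexpanded.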

Indentation

In canonical form, rows are indented with exactly two spaces before the first column value. Source documents may use any consistent indentation, but two spaces is the canonical contract.

Quoted Values

Column values containing spaces, tabs, or special characters must be double-quoted:

tasks[id,label,assignee]:
  T1	"Deploy to staging"	alice
  T2	"Run smoke tests"	bob

Unquoted values may not contain whitespace. Parsers treat the first unescaped tab as the column boundary.
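One way to split a row while honoring quoted values is sketched below in Python. The function name `split_row` is hypothetical, and the sketch assumes that a double quote always opens or closes a quoted value; escape sequences inside quotes are not described here, so they are not handled.

```python
def split_row(line):
    """Split a row body on tabs that fall outside double quotes.

    Assumption: a double quote always opens or closes a quoted value;
    escape sequences inside quotes are not handled.
    """
    cells, current, in_quotes = [], [], False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes  # toggle quoted state, drop the quote
        elif ch == "\t" and not in_quotes:
            cells.append("".join(current))  # tab outside quotes ends a cell
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unterminated quoted value")
    cells.append("".join(current))
    return cells

cells = split_row('T1\t"Deploy to staging"\talice')
```

The returned cells have their surrounding quotes removed, so `"Deploy to staging"` is delivered as the three-word string.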

Empty Cells

An empty cell in the middle of a row is represented by two adjacent tab characters. An empty first column is a tab at the start of the row, and an empty last column is a trailing tab before end-of-line:

items[id,value,note]:
  A1	42	
  A2		"no value"
  A3	99	verified

In the example above, A1 has an empty note, and A2 has an empty value.
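A short Python sketch of how these empty cells survive a tab split. The row strings are written with `\t` in Python literals, which Python turns into real tab characters.

```python
rows = [
    "A1\t42\t",             # trailing tab: empty note
    'A2\t\t"no value"',     # adjacent tabs: empty value
    "A3\t99\tverified",
]

# str.split with an explicit separator keeps empty fields in place.
parsed = [row.split("\t") for row in rows]

# By contrast, split() with no argument collapses whitespace runs and
# silently drops the empty cell.
collapsed = "A1\t42\t".split()
```

Because a trailing tab is significant, tools that strip trailing whitespace from lines would change the column count of rows like A1.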

Mixed Types in Columns

SDIF columns are typed by schema. A schema may declare a column as String, Integer, Identifier, Enum(...), or another supported type, and the parser validates each value against the declared type during schema-aware parsing.

When no schema is present, all values are treated as untyped strings.
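Schema-aware validation of a single cell might look like the following Python sketch. The function `check_value` is hypothetical, and the patterns used for Integer and Identifier are assumptions; the format's actual rules for those types may differ.

```python
import re

def check_value(value, type_name):
    """Return True if a raw string cell is acceptable for the declared type."""
    if type_name == "String":
        return True
    if type_name == "Integer":
        # Assumed integer shape: optional minus sign, then digits.
        return re.fullmatch(r"-?[0-9]+", value) is not None
    if type_name == "Identifier":
        # Assumed identifier shape; the real rule may be stricter or looser.
        return re.fullmatch(r"[A-Za-z_][A-Za-z0-9_-]*", value) is not None
    enum = re.fullmatch(r"Enum\((.*)\)", type_name)
    if enum:
        return value in enum.group(1).split(",")
    raise ValueError(f"unsupported type: {type_name}")
```

For example, `check_value("done", "Enum(done,pending,blocked)")` accepts the value, while `check_value("4x", "Integer")` rejects it.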

Primary Keys

A schema may declare a primary key for a table:

tables[name,ordered,primary_key]:
  milestones	false	id

The primary_key column names the column (or columns) that uniquely identify each row. Primary keys are used during canonicalization to sort rows in unordered tables, and by validators to detect duplicate entries.
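Duplicate detection by primary key can be sketched in Python as follows. The function `duplicate_keys` and the representation of rows as dictionaries are assumptions made for illustration.

```python
from collections import Counter

def duplicate_keys(rows, primary_key):
    """Return primary-key tuples that occur in more than one row.

    `rows` is a list of dicts keyed by column name; `primary_key` is a
    list of one or more column names.
    """
    counts = Counter(tuple(row[col] for col in primary_key) for row in rows)
    return [key for key, n in counts.items() if n > 1]

rows = [
    {"id": "R1", "status": "done"},
    {"id": "R2", "status": "done"},
    {"id": "R1", "status": "pending"},  # same id as the first row
]
dupes = duplicate_keys(rows, ["id"])
```

Using a tuple as the key lets the same function cover composite primary keys.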

Ordered vs Unordered Tables

Tables have two modes, declared in the schema:

Mode       ordered  Row handling
Unordered  false    Rows sorted by primary key in canonical form
Ordered    true     Row order preserved exactly as written

Unordered tables are the common case. When ordered=false, canonicalization sorts rows by their primary key. This makes the canonical form independent of insertion order.

Ordered tables preserve the sequence of rows as authored. Use ordered tables when row position carries meaning — for example, a ranked list, a time-ordered event log, or an explicit sequence of steps.

When no schema is available, parsers preserve row order (treat as ordered).
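The three cases above (unordered, ordered, no schema) can be sketched in Python. The function `canonical_rows` and the dictionary shape used for the schema entry are hypothetical.

```python
def canonical_rows(rows, schema=None):
    """Order rows for canonical output.

    `schema` is a hypothetical dict such as
    {"ordered": False, "primary_key": ["id"]}; None means no schema.
    """
    if schema is None or schema["ordered"]:
        return list(rows)  # keep the authored order
    key_cols = schema["primary_key"]
    return sorted(rows, key=lambda row: tuple(row[c] for c in key_cols))

rows = [{"id": "R3"}, {"id": "R1"}, {"id": "R2"}]
unordered = canonical_rows(rows, {"ordered": False, "primary_key": ["id"]})
ordered = canonical_rows(rows, {"ordered": True, "primary_key": ["id"]})
no_schema = canonical_rows(rows)
```

Only the unordered case changes the sequence; the other two return the rows as written.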

Schema Column Definition

Columns are declared in the schema under the columns table, referencing the parent table by name:

columns[table,name,type,required]:
  milestones	id	Identifier	true
  milestones	status	Enum(done,pending,blocked)	true
  milestones	gate	String	true
  milestones	evidence	String	false

Each row names the table it belongs to, the column name, the value type, and whether the column is required. Required columns must have a non-empty value in every row.
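A check for the required-column rule might be sketched like this in Python. The function `missing_required` is hypothetical, and it assumes the columns table has already been read into dictionaries whose values are still plain strings (so `required` is `"true"` or `"false"`).

```python
def missing_required(rows, column_defs, table):
    """List (row_index, column) pairs where a required column is empty."""
    required = [
        c["name"] for c in column_defs
        if c["table"] == table and c["required"] == "true"
    ]
    problems = []
    for index, row in enumerate(rows):
        for col in required:
            if row.get(col, "") == "":
                problems.append((index, col))
    return problems

column_defs = [
    {"table": "milestones", "name": "id", "type": "Identifier", "required": "true"},
    {"table": "milestones", "name": "gate", "type": "String", "required": "true"},
    {"table": "milestones", "name": "evidence", "type": "String", "required": "false"},
]
rows = [
    {"id": "R1", "gate": "validate-syntax", "evidence": ""},
    {"id": "R2", "gate": "", "evidence": "reports/canonical.md"},
]
problems = missing_required(rows, column_defs, "milestones")
```

The empty `evidence` cell in the first row is not reported because that column is optional.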

Full Example

The following is the milestones table from a Plan document:

milestones[id,status,gate,evidence]:
  R1	done	validate-syntax	reports/syntax.md
  R2	done	validate-canonical	reports/canonical.md
  R3	pending	validate-schema	reports/schema.md
  R4	pending	validate-semantics	reports/semantics.md

With the schema declaring milestones as ordered=false and primary_key=id, the canonical form sorts these rows lexicographically by id — which in this case is already in order.
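Lexicographic ordering compares ids character by character rather than numerically. The Python sketch below assumes that "lexicographic" means plain string comparison by code point; the id `R10` is hypothetical and does not appear in the table above.

```python
# The four ids from the table sort into the order in which they were written.
ids = ["R3", "R1", "R4", "R2"]
in_order = sorted(ids)

# With a two-digit id, string comparison places "R10" before "R2",
# because the character "1" sorts before "2".
with_r10 = sorted(["R2", "R10", "R1"])
```

If numeric order matters for ids that grow past a single digit, zero-padding them (for example `R01`, `R02`, `R10`) keeps string order and numeric order aligned.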

Common Mistakes

  • Using spaces instead of tabs between columns. A sequence of spaces between values is not a column separator. Only U+0009 counts.
  • Forgetting the colon after the column header. milestones[id,status] without the trailing : is not a valid table declaration.
  • Quoting identifiers unnecessarily. If a value contains no spaces or special characters, it should be unquoted. Over-quoting is not a parse error, but it reduces readability.
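The three mistakes listed above can be caught with a small heuristic check. The Python sketch below is illustrative only: `lint_table` is a hypothetical name, and the set of characters treated as never needing quotes is an assumption.

```python
import re

def lint_table(text):
    """Return warnings for the three mistakes listed above (heuristic)."""
    warnings = []
    header, *rows = text.splitlines()
    if not header.endswith(":"):
        warnings.append("header: missing trailing colon")
    n_columns = header.count(",") + 1
    for number, line in enumerate(rows, start=1):
        body = line.strip(" ")  # drop the indent but keep any tabs
        if not body:
            continue
        if n_columns > 1 and "\t" not in body and " " in body:
            warnings.append(f"row {number}: spaces used as column separator?")
        for cell in body.split("\t"):
            # Assumed "plain" characters that never need quoting.
            if re.fullmatch(r'"[A-Za-z0-9_./-]+"', cell):
                warnings.append(f"row {number}: {cell} does not need quotes")
    return warnings

sample = 'tasks[id,label,assignee]\n  T1 "Deploy" alice\n  T2\t"bob"\tcarol\n'
found = lint_table(sample)
```

The sample triggers all three warnings: the header has no colon, the first row uses spaces between values, and the second row quotes a value that has no whitespace.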