BlueId

A BlueId is a content address for Blue content. It lets a document say “this exact content” instead of “whatever is currently at this URL” or “row 123 in my database.”

That does not mean Blue ignores structure or meaning. A BlueId is calculated from canonical Blue content, not from raw YAML bytes. Source conveniences are preprocessed, types can be resolved, overlays can be minimized, and pure references can be expanded or collapsed without changing the semantic identity that matters for a content comparison.

This page explains the practical model: what a BlueId points to, why pure references are strict, how structural and semantic identity differ, and why BlueId is the foundation for shared types and reproducible processing.

§Location identifiers versus content identifiers

Most identifiers point to a place or an assigned record:

| Identifier | What it usually identifies | What can go wrong |
| --- | --- | --- |
| URL | A network location | The server changes the response or disappears. |
| Database id | A row in one database | The id has no meaning outside that database. |
| UUID | An assigned object identity | The id does not tell you whether the content is correct. |
| Human name | A label such as Person | Different systems use the same label for different definitions. |

A BlueId identifies content. If a node provider returns content for a BlueId, the implementation can verify that the returned content matches the requested identity. If it does not match, the provider is wrong or untrusted.

This is what makes a type reference portable:

YAML
type:
  blueId: <PersonBlueId>

A receiving system can resolve <PersonBlueId> through any provider it trusts. The identity is not tied to one server path.

§BlueId is not stored in the document

A BlueId is derived from content. It is normally not an ordinary data field inside the same content it identifies. When you see this:

YAML
blueId: <SomeBlueId>

that node is a pure reference to other content. It is not a field saying “my id is <SomeBlueId>.”

To identify a local document, compute its BlueId from the document. To refer to another document, use a pure reference.

That difference keeps documents self-verifying. If a document simply stored its own id as a mutable field, changing the field would not prove anything about the rest of the content. BlueId is useful because it is derived.

§Pure references are exact

A pure reference is an object with only blueId:

YAML
blueId: <MoneyAmountBlueId>

It cannot be mixed with sibling content:

YAML
# Invalid: this is neither a pure reference nor a clean typed value
blueId: <MoneyAmountBlueId>
amountMinor: 49900
currency: PLN

If the node is a local value whose type is referenced by BlueId, use type:

YAML
type:
  blueId: <MoneyAmountBlueId>
amountMinor: 49900
currency: PLN

This rule pays off when documents are large. In the weekend package checkout, an order can embed child documents, link to anchors, and point to known types without ambiguity about which parts are references and which parts are local state.
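
The rule above can be checked mechanically. The following is a minimal sketch that treats a node as a plain `java.util.Map`; the class `ReferenceShape` and its `Kind` enum are hypothetical names for illustration and are not part of the Blue Java library.

```java
import java.util.Map;

/** Illustrative only: classifies a node-like map by how it uses the blueId key. */
final class ReferenceShape {
    enum Kind { PURE_REFERENCE, LOCAL_CONTENT, INVALID_MIX }

    static Kind classify(Map<String, ?> node) {
        if (!node.containsKey("blueId")) {
            // No blueId key at this level: ordinary local content, possibly typed.
            return Kind.LOCAL_CONTENT;
        }
        // blueId is present, so it must be the only key at this level.
        return node.size() == 1 ? Kind.PURE_REFERENCE : Kind.INVALID_MIX;
    }

    public static void main(String[] args) {
        Map<String, Object> reference = Map.of("blueId", "<MoneyAmountBlueId>");
        Map<String, Object> mixed = Map.of(
                "blueId", "<MoneyAmountBlueId>", "amountMinor", 49900, "currency", "PLN");
        Map<String, Object> typed = Map.of(
                "type", Map.of("blueId", "<MoneyAmountBlueId>"), "amountMinor", 49900, "currency", "PLN");

        System.out.println(classify(reference)); // PURE_REFERENCE
        System.out.println(classify(mixed));     // INVALID_MIX
        System.out.println(classify(typed));     // LOCAL_CONTENT
    }
}
```

The typed value passes because its `blueId` sits inside `type`, not next to the local fields.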

§Structural BlueId and semantic BlueId

The Java library exposes both structural and semantic identity operations.

Structural identity addresses the node as authored after direct BlueId input normalization. It is sensitive to redundant authored content:

JAVA
String blueId = blue.calculateBlueId(node);

Semantic identity is calculated after the source document is preprocessed, resolved, and minimized:

JAVA
String semanticBlueId = blue.calculateSemanticBlueId(node);

Use structural identity when the exact node itself is the thing you want to address. Use semantic identity when two different source forms should compare equal because they resolve to the same minimized meaning.

For example, these two source documents may have different raw YAML text:

YAML
# Compact source
type: MoneyAmount
amountMinor: 49900
currency: PLN

YAML
# Expanded source
type:
  name: Money Amount
  amountMinor:
    type: Integer
    schema:
      minimum: 0
  currency:
    type: Text
    schema:
      minLength: 3
      maxLength: 3
amountMinor: 49900
currency: PLN

If they resolve to the same meaning through the same registry and preprocessing environment, semantic identity is the comparison you usually want. Raw text comparison is not enough.
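
To make the distinction concrete, here is a toy model that does not use the Blue library. It assumes flat maps, a hypothetical type `PlnAmount` that already fixes `currency: PLN`, and a deliberately simple "minimization" that drops fields which only repeat what the type fixes. The rendered strings stand in for the input an identity hash would be computed over; Blue's real pipeline is richer than this.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: contrasts an as-authored identity input with a minimized one. */
final class IdentityInputs {
    /** Structural view: the node as authored, with keys sorted so ordering is not a factor. */
    static String structuralInput(Map<String, Object> authored) {
        return new TreeMap<>(authored).toString();
    }

    /** Semantic view (toy): drop fields that only repeat values the type already fixes. */
    static String semanticInput(Map<String, Object> authored, Map<String, Object> fixedByType) {
        Map<String, Object> minimized = new TreeMap<>(authored);
        // remove(key, value) deletes the entry only when the authored value equals the fixed one.
        fixedByType.forEach((key, value) -> minimized.remove(key, value));
        return minimized.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> fixedByType = Map.of("currency", "PLN"); // hypothetical type constraint
        Map<String, Object> compact = Map.of("type", "PlnAmount", "amountMinor", 49900);
        Map<String, Object> redundant =
                Map.of("type", "PlnAmount", "amountMinor", 49900, "currency", "PLN");

        // The redundant field makes the structural inputs differ.
        System.out.println(structuralInput(compact).equals(structuralInput(redundant))); // false
        // After minimization both forms carry the same meaning.
        System.out.println(semanticInput(compact, fixedByType)
                .equals(semanticInput(redundant, fixedByType)));                         // true
    }
}
```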

§The identity pipeline

A source document moves through several views:

Snippet
Source document
  -> preprocessing
  -> preprocessed document
  -> reference expansion and type resolution
  -> resolved view
  -> canonical identity input
  -> BlueId

The names matter:

| View | What it is for |
| --- | --- |
| Source document | What authors write. It may use blue, aliases, scalar sugar, and concise forms. |
| Preprocessed document | Source after baseline and declared transforms have run. blue is removed. Aliases are replaced in type positions. |
| Resolved view | Runtime view after type references, inherited fields, schema constraints, and overlays are merged. |
| Canonical identity input | Deterministic content used to calculate identity. It may be more compact than the resolved view. |
| Minimized overlay | Author-facing compact form that resolves back to the same meaning. |

This pipeline is why Blue can be pleasant to author without making identity depend on formatting choices.
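
As a rough illustration of the first step, the sketch below mimics two effects the table attributes to preprocessing: the `blue` key is removed, and a string alias in the `type` position is replaced by a pure reference. It handles only the top level of a plain map. The class `AliasPreprocessing` and the alias table passed to it are assumptions made for this example, not the library's preprocessor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a top-level imitation of alias replacement during preprocessing. */
final class AliasPreprocessing {
    static Map<String, Object> preprocess(Map<String, Object> source, Map<String, String> aliases) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (entry.getKey().equals("blue")) {
                continue; // the directive is consumed and does not appear in the output
            }
            Object value = entry.getValue();
            if (entry.getKey().equals("type") && value instanceof String) {
                String blueId = aliases.get(value);
                if (blueId == null) {
                    // An undeclared alias is not portable, so fail instead of guessing.
                    throw new IllegalArgumentException("Unknown type alias: " + value);
                }
                value = Map.of("blueId", blueId);
            }
            out.put(entry.getKey(), value);
        }
        return out;
    }
}
```

Failing on an unknown alias, instead of passing the string through, keeps the result independent of whatever names happen to exist on one machine.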

§BlueId is based on canonical content, not YAML formatting

These authoring choices should not be treated as meaning by themselves:

  • indentation style;
  • field ordering in source files;
  • whether a scalar was written in concise form or wrapper form when the forms are equivalent;
  • whether a known type was referred to by an authoring alias that preprocesses to the same BlueId;
  • whether an inherited field was expanded for readability but minimized away for identity.

What matters is the abstract Blue node model after the appropriate identity pipeline.

That does not mean every visual change is ignored. Changing a text value changes content. Changing a canonical type description can change type identity. Changing list length or list positions changes content. Changing a schema constraint changes content. The rule is not “formatting never matters”; the rule is “identity is based on canonical Blue content, not source-file accidents.”
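
The difference between a source-file accident and a content change can be shown with a small canonical renderer. This is a sketch under simplifying assumptions: nodes are plain maps, lists, and scalars, strings are not escaped, and the output format is invented for the example. It is not Blue's canonical form.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: deterministic rendering where key order is ignored and list order is kept. */
final class CanonicalSketch {
    static String canonical(Object value) {
        if (value instanceof Map) {
            TreeMap<String, Object> sorted = new TreeMap<>();
            ((Map<?, ?>) value).forEach((k, v) -> sorted.put(String.valueOf(k), v));
            StringBuilder sb = new StringBuilder("{");
            sorted.forEach((k, v) -> sb.append('"').append(k).append("\":").append(canonical(v)).append(','));
            return sb.append('}').toString();
        }
        if (value instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            for (Object item : (List<?>) value) {
                sb.append(canonical(item)).append(','); // positions are part of the content
            }
            return sb.append(']').toString();
        }
        if (value instanceof String) {
            return "\"" + value + "\""; // no escaping: acceptable for a sketch only
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("name", "Price");
        a.put("amountMinor", 49900);
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("amountMinor", 49900);
        b.put("name", "Price");

        System.out.println(canonical(a).equals(canonical(b)));                    // true: field order
        System.out.println(canonical(List.of(1, 2)).equals(canonical(List.of(2, 1)))); // false: list order
    }
}
```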

§Node providers make references useful

A pure reference is useful only if something can resolve it. In Java, Blue resolves { blueId: ... } through a NodeProvider.

For tests, a simple provider can be enough:

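JAVA
// Bootstrap instance used only to parse the type definition.
Blue bootstrap = new Blue();

Node priceType = bootstrap.yamlToNode(
    "name: Price\n" +
    "amountMinor:\n" +
    "  type: Integer\n" +
    "currency:\n" +
    "  type: Text\n"
);

// Register the type with an in-memory provider and look up its BlueId by name.
BasicNodeProvider provider = new BasicNodeProvider(priceType);
String priceTypeBlueId = provider.getBlueIdByName("Price");

// This instance can now resolve { blueId: ... } references to the Price type.
Blue blue = new Blue(provider);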

In production, your provider may read from a database, object store, registry, IPFS bridge, or local package. The important contract is that it returns content that verifies against the requested BlueId.

A provider should not ask callers to trust that “this is probably the right type because it came from our server.” The BlueId lets the caller verify the content itself.
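
The verification contract can be sketched without the Blue library. In the example below, a content source is modeled as a plain `Function` from a requested id to optional content, and the id is a hex SHA-256 digest of text that is assumed to be canonical already. Both are stand-ins chosen for illustration: `VerifiedLookup` is a hypothetical class, and the hash is not Blue's actual BlueId algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Illustrative only: accept content from a source only if it verifies against the requested id. */
final class VerifiedLookup {
    /** Stand-in content id: hex SHA-256 of already-canonical text. */
    static String idOf(String canonicalContent) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(canonicalContent.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Returns the first content whose recomputed id equals the requested id. */
    static Optional<String> resolve(String requestedId, List<Function<String, Optional<String>>> sources) {
        for (Function<String, Optional<String>> source : sources) {
            Optional<String> candidate = source.apply(requestedId);
            if (candidate.isPresent() && idOf(candidate.get()).equals(requestedId)) {
                return candidate;
            }
            // Missing or mismatching content: treat this source as wrong or untrusted and move on.
        }
        return Optional.empty();
    }
}
```

Because the check recomputes the id from the returned content, the caller does not need to know or trust where the content came from.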

§Expanded and collapsed views

A document can keep a reference compact:

YAML
customer:
  blueId: <AliceBlueId>
items:
  - product:
      blueId: <WeekendPackageBlueId>
    quantity: 1

A tool can expand only the parts it needs:

YAML
customer:
  name: Alice
  type:
    blueId: <PersonBlueId>
items:
  - product:
      blueId: <WeekendPackageBlueId>
    quantity: 1

The expanded content is useful for display, validation, matching, or processing. The collapsed reference is useful for storage, transport, and keeping large documents readable.

Do not assume expansion is free. A BlueId can refer to a small type definition or a huge graph. A product catalog, a long timeline, or a registry package can be large. Good tools expand by path and purpose rather than blindly loading every reachable node.
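
Expanding by path can be sketched as follows. The example works on plain maps, follows a path made of map keys only (lists are left untouched), and looks references up in a hypothetical in-memory `store` that is assumed to hold already-verified content. `PathExpansion` is an illustrative name, not a library class.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: expands pure references along one path and leaves the rest collapsed. */
final class PathExpansion {
    static Object expandAlong(Object node, List<String> path, Map<String, Map<String, Object>> store) {
        if (!(node instanceof Map)) {
            return node; // scalars and lists are returned unchanged in this sketch
        }
        Map<?, ?> map = (Map<?, ?>) node;
        // A pure reference reached while walking the path is replaced by its stored content.
        if (map.size() == 1 && map.get("blueId") instanceof String && store.containsKey(map.get("blueId"))) {
            map = store.get(map.get("blueId"));
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            String key = String.valueOf(entry.getKey());
            boolean onPath = !path.isEmpty() && path.get(0).equals(key);
            out.put(key, onPath
                    ? expandAlong(entry.getValue(), path.subList(1, path.size()), store)
                    : entry.getValue()); // off-path children keep their collapsed form
        }
        return out;
    }
}
```

Calling `expandAlong(order, List.of("customer"), store)` on the order above would expand the customer while leaving the customer's own `type` reference and every item collapsed.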

§BlueId and shared vocabulary

The old integration problem is that teams share names but not meanings. BlueId changes the unit of reuse from a shared name to an exact, verifiable definition.

A partner can say:

YAML
request:
  type:
    blueId: <PackageOrderRequestBlueId>

Now the receiver can compare the type identity directly. If it recognizes the BlueId, it can use its local dictionary. If it does not recognize the BlueId but can fetch the content, it can still inspect the structure. If it cannot fetch or verify it, it can reject the request deterministically.

This is stronger than agreeing that a field called request contains “some JSON matching our docs.”
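
The receiver's three outcomes can be written as a small decision function. This is a sketch only: `TypeAdmission`, its `Decision` enum, and the `fetch` and `verifies` parameters are hypothetical, and verification is passed in as a predicate so the example does not depend on any particular BlueId algorithm.

```java
import java.util.Optional;
import java.util.Set;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Illustrative only: decides how a receiver treats an incoming type BlueId. */
final class TypeAdmission {
    enum Decision { USE_LOCAL_DEFINITION, INSPECT_FETCHED_CONTENT, REJECT }

    static Decision decide(String typeBlueId,
                           Set<String> locallyKnown,
                           Function<String, Optional<String>> fetch,
                           BiPredicate<String, String> verifies) {
        if (locallyKnown.contains(typeBlueId)) {
            return Decision.USE_LOCAL_DEFINITION; // recognized: use the local dictionary
        }
        Optional<String> fetched = fetch.apply(typeBlueId);
        if (fetched.isPresent() && verifies.test(typeBlueId, fetched.get())) {
            return Decision.INSPECT_FETCHED_CONTENT; // unknown but verifiable: inspect the structure
        }
        return Decision.REJECT; // cannot fetch or cannot verify: same answer every time
    }
}
```

Every input leads to exactly one of the three outcomes, which is what makes the rejection deterministic.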

§BlueId and processing

Content identity also matters after processing. A processor should not merely say “operation succeeded.” It should produce a new canonical document, emitted events, gas accounting, metadata, and an output BlueId.

That output BlueId is useful in tests and replay. If the same input document and same delivered entry produce a different output identity, something changed: the implementation, registry, preprocessing environment, processor options, or input evidence.

For Counter, a test can assert the value of /counter. A stronger test can also assert the output identity and checkpoint state.

For Weekend package checkout, exact output identity is even more valuable because many fields change across several child scopes. Snapshot identity lets you detect accidental changes in a complex flow.

§Common mistakes

Comparing raw YAML strings

YAML text is an authoring representation. Compare Blue identities or resolved/canonical snapshots when meaning matters.

Treating blueId as a normal mutable id field

A root or field object with only blueId is a pure reference. To identify the current document, compute its BlueId.

Mixing pure references with local fields

Use type: { blueId: ... } for typed local content. Use { blueId: ... } alone for pure references.

Expanding too much

A graph can be large. Expand by path and need. Do not write tools that recursively expand every reachable BlueId by default.

Forgetting the preprocessing environment

Semantic identity depends on deterministic preprocessing and registry bindings. Implementation-local aliases may be convenient, but they are not portable unless declared or part of the agreed environment.

§What to remember

BlueId gives Blue its shared vocabulary. Types, business documents, BEX programs, and processed states can all participate in content identity. That makes references verifiable, replay outputs comparable, and cross-system definitions less dependent on local names.