HTML Storage Model

This document defines the HTML storage model used by WildEditorInChief (WEIC).

WEIC stores document content in HTML as its canonical internal format. HTML is the authoritative representation of document body content. Markdown may appear as an import or publication format in some workflows, but it is not the primary storage format inside WEIC.

This decision aligns WEIC with its long-term goals: structured authoring, controlled rendering, fragment reuse, revision management, and flexible export.


Purpose

The HTML storage model exists to support several goals:

  • preserve document content in a rich, structured format
  • support controlled editing and rendering
  • allow fragment reuse within HTML documents
  • enable revision storage without lossy format conversion
  • support downstream export to other formats when required

The key principle is that WEIC should store the document as it is authored rather than converting it into a less expressive intermediate representation.


Canonical document format

The canonical body format for WEIC documents is HTML.

This means:

  • document body content is stored as HTML
  • revisions preserve HTML body content
  • rendering is based on stored HTML
  • exports begin from HTML rather than from Markdown

Markdown is not the canonical format.

Markdown may still appear in some contexts, such as:

  • bootstrap documentation repositories
  • imports from external sources
  • publication workflows
  • conversion pipelines

However, once content is stored in WEIC, the authoritative form is HTML.


Why HTML

HTML was selected as the canonical format because it supports the needs of a controlled documentation system better than plain Markdown.

Advantages include:

  • richer document structure
  • precise control over formatting and semantics
  • better support for embedded structure and reusable fragments
  • easier round-trip editing in browser-based editors
  • more direct rendering and export behavior

This model also aligns naturally with editor technologies that already produce structured HTML output.


Document structure

A WEIC document consists of structured metadata and body content.

Typical document elements include:

  • document identifier
  • title
  • slug
  • hierarchy placement
  • ownership metadata
  • status metadata
  • current revision reference
  • HTML body content stored in revisions

The document record itself should contain document identity and placement metadata, while content belongs to revisions.

This separation keeps document identity stable while allowing content to evolve over time.
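The separation described above can be sketched as two record types. This is a minimal illustration only: the class names, field names, and sample values are assumptions made for the example and are not taken from the WEIC schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Document:
    """Identity and placement metadata only; no body content lives here."""
    document_id: str
    title: str
    slug: str
    parent_id: Optional[str]           # hierarchy placement (hypothetical field)
    owner: str                         # ownership metadata
    status: str                        # status metadata, e.g. "draft"
    current_revision_id: Optional[str] = None


@dataclass(frozen=True)
class Revision:
    """An immutable snapshot of the HTML body for one document."""
    revision_id: str
    document_id: str
    created_at: datetime
    author: str
    body_html: str
    notes: str = ""


# Sample values are illustrative only.
doc = Document("doc-1", "Access Policy", "access-policy", None, "alice", "draft")
rev = Revision("rev-1", doc.document_id, datetime.now(timezone.utc), "alice",
               "<h1>Access Policy</h1><p>Initial text.</p>")

# The document points at its content; it does not contain it.
doc.current_revision_id = rev.revision_id
```

Marking the revision as frozen is one way to express that stored content is never edited in place; a change produces a new revision and the document reference moves.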


Revision storage

WEIC uses revision-based storage.

Each content change creates a new revision record. A revision stores the document body in HTML along with revision metadata.

Typical revision elements include:

  • revision identifier
  • document identifier
  • created timestamp
  • author
  • revision notes, where applicable
  • HTML body content

This model provides:

  • historical traceability
  • safe editing
  • rollback capability
  • comparison between revisions
  • controlled publication of approved content

The current document state is represented by the current revision reference rather than by mutating a single body field in place.
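As a rough sketch of this behavior, the following in-memory store appends a revision on every save and implements rollback by repointing the current revision reference. The store, its method names, and the revision id scheme are hypothetical; a real implementation would sit on a database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class Revision:
    revision_id: str
    document_id: str
    created_at: datetime
    author: str
    body_html: str


@dataclass
class RevisionStore:
    """Append-only store: saving never overwrites an earlier revision."""
    revisions: Dict[str, Revision] = field(default_factory=dict)
    # document_id -> revision ids, oldest first
    history: Dict[str, List[str]] = field(default_factory=dict)
    # document_id -> current revision id
    current: Dict[str, str] = field(default_factory=dict)

    def save(self, document_id: str, author: str, body_html: str) -> Revision:
        ids = self.history.setdefault(document_id, [])
        rev = Revision(f"{document_id}-r{len(ids) + 1}", document_id,
                       datetime.now(timezone.utc), author, body_html)
        self.revisions[rev.revision_id] = rev
        ids.append(rev.revision_id)
        self.current[document_id] = rev.revision_id
        return rev

    def rollback(self, document_id: str, revision_id: str) -> None:
        """Repoint the current reference; history is left intact."""
        if revision_id not in self.history.get(document_id, []):
            raise KeyError(revision_id)
        self.current[document_id] = revision_id

    def current_body(self, document_id: str) -> str:
        return self.revisions[self.current[document_id]].body_html


store = RevisionStore()
first = store.save("doc-1", "alice", "<p>Version one.</p>")
store.save("doc-1", "bob", "<p>Version two.</p>")
store.rollback("doc-1", first.revision_id)
```

After the rollback, the current body is the first revision again, and both revisions remain available for comparison.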


Fragment model

WEIC supports reusable fragments that can be inserted into documents.

Fragments are also stored as HTML.

This allows fragments to preserve rich structure and formatting without conversion.

Typical fragment properties include:

  • fragment identifier
  • fragment name
  • current revision reference
  • HTML body content in fragment revisions

A fragment may represent:

  • standardized language
  • reusable policy text
  • common operational instructions
  • shared document sections

The HTML storage model makes fragment reuse predictable because inserted content already exists in the same canonical format as the parent document.


Fragment inclusion behavior

Fragment use should remain explicit.

A document may:

  • embed fragment content directly at render time
  • store fragment references for later expansion
  • support controlled materialization for export or publication workflows

The important architectural point is that fragments and documents share the same canonical body format.

This avoids the complexity of mixing Markdown fragments into HTML documents or vice versa.
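To illustrate the reference-then-expand option, the sketch below replaces fragment placeholders with fragment HTML at render or export time. The source does not define a reference syntax, so the `<weic-fragment ref="...">` placeholder and the `expand_fragments` helper are assumptions made for this example.

```python
import re
from typing import Dict

# Hypothetical placeholder syntax; the source does not define one.
FRAGMENT_REF = re.compile(r'<weic-fragment ref="([^"]+)"></weic-fragment>')


def expand_fragments(body_html: str, fragments: Dict[str, str]) -> str:
    """Replace each placeholder with the HTML of the referenced fragment.

    Unknown references are left in place so a missing fragment stays visible.
    Expansion is single-pass: placeholders inside fragment bodies are not expanded.
    """
    def substitute(match: re.Match) -> str:
        return fragments.get(match.group(1), match.group(0))

    return FRAGMENT_REF.sub(substitute, body_html)


# fragment id -> current HTML body of that fragment (illustrative data)
fragments = {"retention-notice": "<p>Standard retention language.</p>"}
body = '<h2>Retention</h2><weic-fragment ref="retention-notice"></weic-fragment>'
expanded = expand_fragments(body, fragments)
```

Because both sides are HTML, expansion is plain substitution with no format conversion. A regular expression is adequate here only because the placeholder has one fixed shape; a parser-based approach would be more robust if the syntax varied.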


Rendering model

Rendering in WEIC begins from stored HTML.

Typical rendering paths include:

  • editing in an HTML-capable editor
  • displaying rendered document content in the application
  • publishing HTML output
  • transforming HTML to other target formats

Because HTML is already the stored format, rendering does not require an intermediate conversion step from Markdown.

This reduces ambiguity and preserves structure more reliably.


Editing model

WEIC editing is expected to operate on structured HTML content.

This does not mean users edit raw HTML directly in all cases. In practice, an editor may provide a rich text or structured interface while still producing HTML as the persisted result.

This approach supports:

  • browser-based editing
  • consistent round-trip behavior
  • preservation of document structure
  • controlled sanitization and validation

The persisted output remains HTML even if the editing interface abstracts the markup from the user.


Export model

Exports should begin from canonical HTML content.

Possible export targets include:

  • HTML publication
  • PDF generation
  • DOCX generation
  • Markdown export where required

Because HTML is the canonical source, exports are treated as derived formats rather than as the authoritative record.

This keeps the storage model simple:

authoring
   ↓
HTML revision storage
   ↓
rendering and export

Rather than:

authoring
   ↓
Markdown storage
   ↓
HTML conversion
   ↓
export conversion

The HTML-first model reduces unnecessary conversion chains.


Relationship to EIC

EditorInChief (EIC) was the earlier knowledge editor that preceded WEIC.

EIC was originally designed to organize and manage policies and to gather evidence for HITRUST certification.

WEIC extends that model into a broader documentation and knowledge platform.

The HTML storage decision reflects lessons from that evolution:

  • document structure matters
  • revision fidelity matters
  • reusable content matters
  • rendered output matters

Using HTML as the canonical format supports these requirements more directly than Markdown.


Search and indexing implications

Because HTML is the canonical body format, indexing systems should extract searchable text from sanitized HTML content.

This means search pipelines should:

  • parse stored HTML
  • extract text content
  • preserve structural cues where useful
  • index metadata and body content separately where appropriate

Search should not rely on the existence of Markdown source.

The system should treat HTML as the source of truth for indexing and retrieval.
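The extraction steps above might look like the following, using only the Python standard library parser. The function name and the choice to keep heading text apart from body text are illustrative assumptions, not a description of an existing WEIC indexer.

```python
from html.parser import HTMLParser
from typing import Dict, List

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIPPED = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collects visible text, keeping heading text separate as a structural cue."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.headings: List[str] = []
        self.body: List[str] = []
        self._heading_depth = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self._skip_depth += 1
        elif tag in HEADINGS:
            self._heading_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag in HEADINGS and self._heading_depth:
            self._heading_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())   # normalize whitespace
        if not text or self._skip_depth:
            return
        (self.headings if self._heading_depth else self.body).append(text)


def extract_text(html: str) -> Dict[str, object]:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {"headings": parser.headings, "body": " ".join(parser.body)}


indexed = extract_text(
    "<h1>Access Policy</h1>"
    "<p>All users &amp; admins must enroll.</p>"
    "<script>x()</script>"
)
```

Heading text and body text come back separately so an index can weight them differently, and character references such as `&amp;` are decoded before indexing.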


Sanitization and validation

Because HTML is stored directly, WEIC must apply controlled sanitization and validation rules.

Typical goals include:

  • remove disallowed markup
  • preserve approved structural elements
  • prevent unsafe script or embedded content
  • enforce predictable document structure

The storage model assumes HTML is controlled, not arbitrary.

This is an important part of making HTML safe and reliable as a canonical format.
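A minimal allowlist sanitizer, sketched with the standard library parser, shows the shape of these rules. The tag, attribute, and URL-prefix allowlists are hypothetical, and the sketch is not hardened; a production system would more likely rely on a maintained, well-reviewed sanitization library.

```python
from html import escape
from html.parser import HTMLParser
from typing import List

# Hypothetical allowlists; a real deployment would define its own.
ALLOWED_TAGS = {"h1", "h2", "h3", "p", "ul", "ol", "li",
                "em", "strong", "a", "code", "pre"}
ALLOWED_ATTRS = {"a": {"href"}}
DROP_WITH_CONTENT = {"script", "style"}
SAFE_PREFIXES = ("http://", "https://", "mailto:", "/", "#")


class Sanitizer(HTMLParser):
    """Rebuilds the document from allowed tags and escaped text only."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: List[str] = []
        self._drop_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._drop_depth += 1
            return
        if self._drop_depth or tag not in ALLOWED_TAGS:
            return  # the tag is removed; its text content is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_PREFIXES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._drop_depth = max(0, self._drop_depth - 1)
        elif not self._drop_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._drop_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html: str) -> str:
    parser = Sanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


dirty = ('<p onclick="x()">Hello <b>world</b><script>alert(1)</script> '
         '<a href="javascript:alert(1)">link</a></p>')
clean = sanitize(dirty)
```

The sketch rebuilds output from what is explicitly allowed rather than trying to strip what is dangerous, so event-handler attributes, unknown tags, comments, and non-allowlisted link targets fall away by default.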


Design principles

The HTML storage model follows several principles.

Canonical richness

Store the document in a format rich enough to preserve real structure.

Revision fidelity

Revisions should preserve authored content without unnecessary conversion loss.

Format separation

Canonical storage and export formats are not the same thing.

Fragment consistency

Documents and fragments should share the same storage model.

Controlled rendering

Stored content should be renderable in predictable ways across editing, viewing, and publication.


Relationship to the Oryvin plan

WEIC acts as the knowledge core of the Oryvin ecosystem. The HTML storage model defines how that knowledge is stored internally.

authored knowledge
        ↓
HTML document and fragment revisions
        ↓
rendering, publication, and export
        ↓
automation and operational use

This decision affects document governance, publication, fragment reuse, and long-term system evolution.