SPEC 004 - Component Flows
Status
- Status: Working specification of the current component flows
- Scope: OntoBDC core component flows exposed through the CLI
- Primary sources:
  - docs/documentation/use_case
  - docs/documentation/spec/SPEC001_core_modules_and_commands.md
1. Purpose
This specification describes the operational flows of each OntoBDC core component.
The goal is not to restate every command option. Instead, this document focuses on:
- what flow each component owns
- which states or checkpoints matter in that flow
- how the user and the system interact
- how one component hands work to another
This document complements SPEC001: SPEC001 describes the command surface and options, while SPEC004 describes the runtime flow of each component.
2. Flow Design Principles
Across the current core, the main design principles are:
- components own distinct operational responsibilities
- initialization is separated from execution
- validation can be invoked directly or as a gate before other actions
- package metadata, source data, and execution logic are layered
- some flows are command-driven while others are capability-driven
The current user-facing flow components are:
- init
- check
- run
- list
- storage
- dev
- a3
3. Flow Overview
3.1 Bootstrap Flow
Owned by:
init
Purpose:
- establish the minimum local OntoBDC structure required for later operations
3.2 Validation Flow
Owned by:
check
Purpose:
- validate whether the environment, engine, and dependencies are operational
3.3 Capability Execution Flow
Owned by:
run
Purpose:
- discover capabilities, resolve execution context, and execute the selected capability
3.4 Capability Discovery Flow
Owned by:
list
Purpose:
- expose the current catalog of available capabilities and their metadata
3.5 Dataset Registration Flow
Owned by:
storage
Purpose:
- register, list, and remove dataset roots in the local storage index
3.6 Developer Workspace Flow
Owned by:
dev
Purpose:
- coordinate repository-oriented developer actions and local dev configuration
3.7 A3 Lifecycle Flow
Owned by:
a3
Purpose:
- ingest a source into an A3 lifecycle package and advance it through the A3 state machine
4. Component Flow Specifications
4.1 init Component Flow
Intent
The init component creates the local OntoBDC working structure for a project.
Main Flow
- The user runs ontobdc init or ontobdc init <engine>.
- The system checks whether .__ontobdc__/config.yaml already exists.
- If configuration already exists, the system blocks duplicate initialization.
- If no engine was provided, the system tries to infer one from the environment.
- The system writes the selected engine and default configuration into config.yaml.
- The system runs post-initialization checks to verify readiness.
- The project becomes available to the remaining components.
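The duplicate-initialization guard and the engine fallback can be illustrated with a minimal sketch. It is not the OntoBDC implementation: the ONTOBDC_ENGINE variable, the "venv" default, the function names, and the hand-written one-line YAML are all assumptions made for illustration, and the post-initialization checks are left out.

```python
import os
from pathlib import Path

CONFIG_DIR = ".__ontobdc__"   # directory name taken from this spec
CONFIG_FILE = "config.yaml"   # file name taken from this spec


class AlreadyInitializedError(RuntimeError):
    """Raised when config.yaml already exists (duplicate initialization)."""


def infer_engine(env=None):
    # Hypothetical inference rule: read ONTOBDC_ENGINE, else fall back to "venv".
    env = os.environ if env is None else env
    return env.get("ONTOBDC_ENGINE", "venv")


def init_project(root, engine=None, env=None):
    config_path = Path(root) / CONFIG_DIR / CONFIG_FILE
    if config_path.exists():
        # Block duplicate initialization instead of overwriting the file.
        raise AlreadyInitializedError(f"{config_path} already exists")
    selected = engine or infer_engine(env)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    # Minimal YAML written by hand to keep the sketch dependency-free.
    config_path.write_text(f"engine: {selected}\n", encoding="utf-8")
    return config_path
```

Raising on an existing config.yaml, rather than silently returning, keeps the "blocked" outcome visible to whatever calls the function.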
Output
- initialized local configuration
4.2 check Component Flow
Intent
The check component validates whether the current project environment is safe and usable.
Main Flow
- The user or another component invokes ontobdc check.
- The system resolves project configuration and engine.
- The system loads the enabled checks from configuration.
- The system executes the selected checks sequentially.
- Each check returns a status, and optionally a repair path.
- The system aggregates the results and prints the final status.
Repair Flow
- The user passes --repair.
- Failing checks that expose repair logic attempt remediation.
- The system reruns or reports the repaired state accordingly.
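The main and repair flows can be sketched together as a sequential runner. The Check dataclass and the run_checks function are hypothetical names, and the sketch assumes a repaired check is rerun once; the spec leaves open whether the system reruns or only reports.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Check:
    name: str
    run: Callable[[], bool]                      # returns True when the check passes
    repair: Optional[Callable[[], None]] = None  # optional repair path


def run_checks(checks, repair=False):
    results = {}
    for check in checks:  # sequential execution, as the flow describes
        passed = check.run()
        if not passed and repair and check.repair is not None:
            check.repair()
            passed = check.run()  # assumed behavior: rerun once after remediation
        results[check.name] = passed
    # Aggregate: the overall status is True only when every check passed.
    return all(results.values()), results
```

A caller would print the per-check results and use the aggregated boolean as the final status.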
Output
- environment status
- failure points
- optional repair actions
4.3 run Component Flow
Intent
The run component is the main execution flow of OntoBDC capabilities.
Main Flow
- The user runs ontobdc run with a capability identifier or selection parameters.
- The system resolves repository and CLI context parameters.
- The system loads capability packages and collects capability metadata.
- The system applies filters, pagination, and export preferences.
- If --id is present, the system resolves that capability directly.
- If no explicit target is given, the system presents the capability selection flow.
- The selected capability executes with the resolved context.
- The system renders the result using the appropriate export strategy.
Failure Flow
- The user provides an unknown capability identifier.
- The system fails during capability resolution.
- The system reports that the capability was not found.
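Direct resolution, interactive selection, and the not-found failure can be sketched as follows. The catalog shape (a dict from identifier to callable), the select callback, and the exception name are assumptions for illustration only; filtering, pagination, and export are omitted.

```python
class CapabilityNotFoundError(LookupError):
    """Raised when an explicit capability identifier is unknown."""


def resolve_capability(catalog, capability_id=None, select=None):
    # catalog: dict mapping capability id -> callable that takes a context.
    if capability_id is not None:
        try:
            return catalog[capability_id]  # explicit id: resolve directly
        except KeyError:
            raise CapabilityNotFoundError(
                f"capability not found: {capability_id}"
            ) from None
    if select is None:
        raise ValueError("no capability id and no selection callback provided")
    # No explicit target: delegate the choice to a selection callback.
    return catalog[select(sorted(catalog))]


def run_capability(catalog, context, capability_id=None, select=None):
    capability = resolve_capability(catalog, capability_id, select)
    return capability(context)
```

For example, run_capability(catalog, context, capability_id="demo.echo") executes that entry, while an unknown identifier raises CapabilityNotFoundError before anything runs.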
Output
- executed capability result
- rendered output in terminal, JSON, or another selected format
4.4 list Component Flow
Intent
The list component exposes the discovery flow without executing capabilities.
Main Flow
- The user runs ontobdc list.
- The system scans the configured capability packages.
- The system loads capability metadata.
- The system deduplicates and normalizes the catalog view.
- The system renders the result either as rich cards or JSON.
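The deduplication and normalization step might look like the sketch below. The metadata fields (id, name, description) and the first-occurrence-wins rule are assumptions; the spec does not define the catalog schema or the tie-breaking policy.

```python
import json


def build_catalog(entries):
    """Deduplicate by id (first occurrence wins) and normalize the fields."""
    seen = {}
    for entry in entries:
        cap_id = entry["id"].strip()
        if cap_id in seen:
            continue  # duplicate id from another package: keep the first one
        seen[cap_id] = {
            "id": cap_id,
            "name": entry.get("name", cap_id).strip(),
            "description": entry.get("description", "").strip(),
        }
    # Sort by id so that the rendered catalog is stable between runs.
    return [seen[key] for key in sorted(seen)]


def render_json(catalog):
    return json.dumps(catalog, indent=2, sort_keys=True)
```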
Output
- capability catalog
- capability metadata for later use by run
4.5 storage Component Flow
Intent
The storage component manages the local dataset registration flow, the root storage index, and the integrity of container-local metadata.
Main Flow (List)
- The user runs ontobdc storage --list or ontobdc storage -l.
- The system checks whether the storage extra dependencies are installed and .__ontobdc__/storage.ttl exists.
- If no storage index exists or dependencies are missing, the system warns that storage has not been enabled.
- If the storage index exists, the system parses the RDF graph and lists the registered containers.
Enablement Flow
- The user runs ontobdc storage --enable.
- The system installs the required storage dependencies (ontobdc[storage]).
- The system creates the storage index .__ontobdc__/storage.ttl when necessary.
- The system initializes the root storage container metadata.
- The updated storage metadata is persisted.
Container Creation Flow
- The user runs ontobdc storage --create <path>.
- The system normalizes the target path relative to the project root.
- The system loads the root .__ontobdc__/storage.ttl.
- The system creates and persists a new container description in the root graph.
- The system creates <path>/.__ontobdc__/storage.ttl when necessary.
- The system copies the registered container triples from the root graph into the container-local storage.ttl.
- The system creates <path>/.__ontobdc__/ro-crate-metadata.json when necessary.
- The system refreshes the container RO-Crate metadata so that the local metadata file is up to date.
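The path normalization and the "create when necessary" steps can be sketched with the standard library alone. The function names are hypothetical, the files are created as empty placeholders, and the RDF triple copy and RO-Crate refresh are deliberately not modeled here.

```python
from pathlib import Path

META_DIR = ".__ontobdc__"  # directory name taken from this spec


def normalize_container_path(project_root, target):
    root = Path(project_root).resolve()
    candidate = Path(target)
    if not candidate.is_absolute():
        candidate = root / candidate
    candidate = candidate.resolve()
    # relative_to raises ValueError when the target lies outside the project root.
    return candidate.relative_to(root).as_posix()


def ensure_container_files(project_root, relative_path):
    meta = Path(project_root) / relative_path / META_DIR
    meta.mkdir(parents=True, exist_ok=True)
    created = []
    # Placeholder contents only; real files would hold RDF triples and RO-Crate JSON-LD.
    for name, default in (("storage.ttl", ""), ("ro-crate-metadata.json", "{}\n")):
        path = meta / name
        if not path.exists():
            path.write_text(default, encoding="utf-8")
            created.append(name)
    return created
```

Returning the list of files that were actually created makes the step idempotent and easy to report: a second call on the same container returns an empty list.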
Storage Integrity Check Flow
The current storage-specific checks are owned by storage/plugin/check.
- has_container_config_file/check.py
  - validates that each registered container has its local .__ontobdc__ directory and storage.ttl
- is_root_set/check.py
  - validates that the root storage graph contains the ::ROOT:: container
- is_crate_healthy/check.py
  - validates that each container has a readable ro-crate-metadata.json
These checks are intentionally scoped:
- root validation is isolated from child-container validity
- container-config validation is isolated from full graph triple equality
- RO-Crate validation is isolated from RDF graph semantics
Storage Repair Flow
- When a storage check exposes hotfix.py, repair recreates only the missing or stale artifact of that check.
- has_container_config_file/hotfix.py
  - recreates missing container config directories and container storage.ttl
  - synchronizes root graph triples into container-local storage.ttl
- is_root_set/hotfix.py
  - recreates the root storage.ttl if missing
  - ensures the ::ROOT:: container exists
- is_crate_healthy/hotfix.py
  - recreates missing ro-crate-metadata.json
  - refreshes the crate metadata using the container directory as write target
  - excludes internal metadata files such as storage.ttl from the crate file listing
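The exclusion of internal metadata from the crate file listing can be sketched as a directory walk. This assumes that "internal" means everything under the .__ontobdc__ directory, which the spec only implies; crate_file_listing is a hypothetical helper, not part of the actual hotfix.

```python
from pathlib import Path

INTERNAL_DIR = ".__ontobdc__"  # directory name taken from this spec


def crate_file_listing(container_dir):
    root = Path(container_dir)
    files = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        if INTERNAL_DIR in relative.parts:
            continue  # assumed rule: anything under .__ontobdc__ is internal metadata
        files.append(relative.as_posix())
    return sorted(files)
```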
Output
- storage catalog
- updated dataset registration state
- repaired container-local metadata state
4.6 dev Component Flow
Intent
The dev component coordinates developer-oriented repository workflows.
Enablement Flow
- The user runs ontobdc dev --enable-dev-tool.
- The system writes dev.tool: enabled into local config.
- Protected developer flows become available for the project.
Commit Flow
- The user runs ontobdc dev commit "<message>".
- The system checks whether the dev tool is enabled.
- The system validates the semantic commit message.
- The system delegates to the repository commit script.
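The two gates in the commit flow (dev tool enabled, then semantic message) can be sketched as below. The spec does not define what counts as a semantic commit message, so the Conventional Commits-style pattern is an assumption, as is the nested dict used to represent dev.tool: enabled.

```python
import re

# Assumed Conventional Commits-style pattern; the actual rule set is not given in this spec.
SEMANTIC_COMMIT = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([a-z0-9_\-./]+\))?!?: \S.*$"
)


def is_semantic_commit(message):
    # Only the first line (the subject) is validated.
    first_line = message.splitlines()[0] if message else ""
    return SEMANTIC_COMMIT.match(first_line) is not None


def can_commit(config, message):
    # Assumed config shape: {"dev": {"tool": "enabled"}} for "dev.tool: enabled".
    if config.get("dev", {}).get("tool") != "enabled":
        return False, "dev tool is not enabled"
    if not is_semantic_commit(message):
        return False, "commit message is not semantic"
    return True, "ok"
```

Only when both gates pass would the flow delegate to the repository commit script.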
Branch Flow
- The user runs ontobdc dev branch.
- The system checks whether the dev tool is enabled.
- The system delegates to the branch script.
- The script inspects branch state across repositories.
Checkout Flow
- The user runs ontobdc dev checkout <name>.
- The system verifies dev enablement.
- The system delegates checkout across the configured repositories.
SSH Key Flow
- The user runs ontobdc dev repo --add-ssh-key <path> or --rm-ssh-key.
- The system updates the local SSH key configuration.
Output
- updated local developer config
- repository state changes
- developer workflow feedback
4.7 a3 Component Flow
Intent
The a3 component owns the lifecycle flow for A3 packages.
It has two main operational flows:
- ingestion flow
- work processing flow
Ingestion Flow
- The user runs ontobdc a3 --etl --source <file|url>.
- The system verifies that A3 is enabled.
- The system resolves an extraction strategy for the provided source.
- The system extracts and normalizes the source content.
- The system computes a deterministic package identifier.
- The system writes raw.txt into the lifecycle package directory.
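One way to obtain a deterministic package identifier is to hash the normalized content, so that the same source always maps to the same package directory. The sketch below assumes SHA-256 and a simple text normalization; the spec does not state which hash or normalization rules A3 uses, and the function names are hypothetical.

```python
import hashlib
import unicodedata
from pathlib import Path


def normalize_text(text):
    # Assumed normalization: Unicode NFC, LF newlines, no trailing whitespace per line.
    text = unicodedata.normalize("NFC", text).replace("\r\n", "\n").replace("\r", "\n")
    return "\n".join(line.rstrip() for line in text.split("\n")).strip() + "\n"


def package_id(text):
    # Same normalized content always yields the same identifier.
    return hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()


def ingest(lifecycle_root, text):
    package_dir = Path(lifecycle_root) / package_id(text)
    package_dir.mkdir(parents=True, exist_ok=True)
    raw_path = package_dir / "raw.txt"
    # Bytes are written directly so that the platform does not translate newlines.
    raw_path.write_bytes(normalize_text(text).encode("utf-8"))
    return raw_path
```

Because the identifier depends only on normalized content, ingesting the same source twice lands in the same package directory instead of creating a duplicate.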
Work Flow
- The user runs ontobdc a3 --work.
- The system lists all lifecycle packages.
- The system creates one worker per package.
- Each worker evaluates the current package state from artifacts already on disk.
- The worker initializes the A3 state machine at that physical state.
- The worker performs valid transitions until it reaches a final state or an error.
- Each successful transition writes the next artifact into the same package.
State Sequence
The canonical sequence is:
- undefined
- received
- sanitized
- parsed
- translated
- validated
- reasoned
- dispatched
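The idea of deriving the current state from artifacts on disk and then advancing through the sequence can be sketched as follows. Only raw.txt and event.jsonld are named in this spec; their mapping to received and dispatched, and every other artifact file name, is a hypothetical placeholder. The transitions argument is likewise an assumed interface.

```python
from pathlib import Path

STATES = ["undefined", "received", "sanitized", "parsed",
          "translated", "validated", "reasoned", "dispatched"]

# Hypothetical mapping: only raw.txt and event.jsonld appear in the spec.
ARTIFACTS = {
    "received": "raw.txt",
    "sanitized": "sanitized.txt",
    "parsed": "parsed.json",
    "translated": "translated.json",
    "validated": "validated.json",
    "reasoned": "reasoned.json",
    "dispatched": "event.jsonld",
}


def physical_state(package_dir):
    """Return the furthest state whose artifacts are all present, in order."""
    state = "undefined"
    for candidate in STATES[1:]:
        if not (Path(package_dir) / ARTIFACTS[candidate]).exists():
            break
        state = candidate
    return state


def work(package_dir, transitions):
    """transitions: dict mapping target state -> callable(package_dir) -> artifact text."""
    package_dir = Path(package_dir)
    state = physical_state(package_dir)
    while state != STATES[-1]:
        target = STATES[STATES.index(state) + 1]
        step = transitions.get(target)
        if step is None:
            break  # no transition available: the package stays where it is
        # A successful transition writes the next artifact into the same package.
        (package_dir / ARTIFACTS[target]).write_text(step(package_dir), encoding="utf-8")
        state = target
    return state
```

Because the state is recomputed from disk on every run, a worker interrupted mid-sequence resumes from the last artifact that was actually written. An exception raised by a transition simply propagates, leaving the package at its last good state.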
A3 Work Sequence Diagram
Output
- lifecycle package artifacts
- final event.jsonld for dispatched packages
- failure diagnostics when a package gets stuck or invalid
5. Cross-Component Relationships
The components are not isolated. Their flows compose into a larger runtime lifecycle.
5.1 Initialization Before Execution
- init establishes the local structure
- check, run, storage, dev, and a3 generally assume that structure exists
5.2 Validation As Gate
- check can be run explicitly by the user
- dev and initialization-related flows may invoke validation implicitly
5.3 Discovery Before Execution
- list exposes the discoverable capability surface
- run uses the same discovery logic for execution
5.4 Dataset And Package Orientation
- storage manages dataset roots and storage registration
- a3 manages package lifecycles inside its own lifecycle area
- index.rdf, nid.rdf, and datapackage.json patterns fit into this broader package-oriented design
6. Summary
The OntoBDC core is organized around component flows rather than a single linear application pipeline.
Each component owns a distinct operational concern:
- init bootstraps
- check validates
- run executes capabilities
- list exposes the capability catalog
- storage manages dataset registration
- dev coordinates repository workflows
- a3 runs lifecycle-based package processing
Together, these flows define the current runtime behavior of the platform at the component level.