
Observer Shell HOWTO

This manual explains how to use Observer's shell-oriented workflow surface for local staged verification.

It is written for people who want to stop maintaining one-off shell or Python orchestration scripts and instead describe deterministic local workflows in Observer.

This document lives in lib/shell because that is where the runnable shell examples live in this repository.

If you want to see how shell workflows now fit into the new top-level product-certification layer, also read ../../examples/product-certify/README.md after this manual.

Quick Start: First 5 Commands

If you want the fastest path to the shell workflow model, start in lib/shell/starter-pipeline/ and run:

make clean
make run
make report
cat .observer/report.default.jsonl
make verify

That sequence shows the whole artifact workflow in order:

  • clean generated state
  • run the human-readable pipeline
  • emit the canonical JSONL report
  • inspect the report directly
  • compare live behavior against the checked-in snapshot
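The report step matters because the JSONL file is the canonical record, one JSON object per line. As a hedged sketch (the field names here are hypothetical, not Observer's actual report schema), inspecting such a report programmatically looks like this:

```python
import json

# Hypothetical report lines; real field names come from your own
# .observer/report.default.jsonl, not from this sketch.
sample = [
    '{"case": "alpha", "verdict": "pass"}',
    '{"case": "beta", "verdict": "fail"}',
]

# One JSON object per line is the JSONL contract.
records = [json.loads(line) for line in sample]
failing = [r["case"] for r in records if r["verdict"] != "pass"]
print(failing)
```

The point is only that the report is machine-readable line by line, which is what makes the snapshot comparison in make verify mechanical.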

1. What Observer Is Doing Here

In this part of Observer, you are not writing a general-purpose script.

You are declaring a deterministic workflow contract.

That contract says:

  • how cases are discovered
  • what each case is called
  • which stages run for each case
  • which artifacts each stage publishes
  • which later stages are allowed to consume those artifacts
  • what facts are extracted and asserted
  • how the result is reported canonically

The key idea is that stages do not communicate through guessed file paths or shell conventions.

They communicate through explicit named artifacts.

That is the main shift from ad hoc scripting.

Product Certification Above A Workflow

The shell workflow surface is still the stage-level mechanism.

What is new is the layer above it.

Observer can now certify one product from several declared suites together.

That matters when your workflow suite is only one part of release readiness, for example:

  • one unit suite must pass
  • one workflow corpus suite must pass
  • both together define the product verdict

In that model:

  • the shell workflow stays a normal full-surface suite
  • a product.json file declares that suite as one certification stage
  • observer certify runs the ordered product stages and emits one product report
  • observer cube-product turns the product report into per-stage analytics plus one compare-index

The smallest runnable example of that lives at ../../examples/product-certify/.

That example is useful when you want to understand where a shell workflow fits in the new product-level contract.

2. What Problem This Solves

This surface is for workflows like:

  • compile a corpus file
  • certify the produced output
  • lower it to another form
  • run a produced executable
  • inspect JSON or JSONL outputs
  • assert on exit codes, stdout, stderr, extracted fields, or stage failures

It is not for:

  • arbitrary shell automation
  • host configuration management
  • remote orchestration
  • a general scripting language with mutation and loops

If you need those things, this is the wrong tool.

If you need deterministic local verification of staged artifacts, this is the right model.

3. Mental Model

An Observer shell workflow has four parts:

  1. case discovery
  2. ordered actions
  3. named artifacts
  4. explicit assertions

A good way to think about one suite item is:

  • discover cases from explicit inputs
  • for each case, run stages in source order
  • publish artifacts when a stage produces something important
  • check artifacts before downstream use
  • extract structured facts when needed
  • assert what must be true

4. Smallest Useful Shape

A typical workflow item looks like this:

module Example.

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	(proc: "./bin/compile" args: [
		joinPath: ["corpus", (case path)],
		joinPath: [".observer/out", (case stem)],
		joinPath: [".observer/out", (case stem), "typed_unit.jsonl"]
	] timeoutMs: 2000) ifOk: [ :compile |
		expect: (compile exit) = 0.
		publish: "typed_unit" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "typed_unit.jsonl"].
	] ifFail: [ :f |
		expect: Fail msg: "compile stage failed".
	].
].

That already tells you most of the model:

  • cases come from files
  • each case gets a binding called case
  • proc: runs a local process
  • ifOk: binds the result on success
  • ifFail: binds the failure on action failure
  • expect: records assertions
  • publish: turns a concrete file path into a named artifact

5. File Structure

A full shell workflow file is UTF-8 text.

It may begin with:

module Name.

After that, it contains one or more suite items.

Comments start with ;;.

Example:

module Demo.

;; compile every corpus source file
(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	...
].

6. How Cases Are Discovered

The current shell workflow examples use filesystem discovery.

The form is:

(files: <root> glob: <pattern> key: <field>) forEachCase: [ :case | ... ].

Example:

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	...
].

Meaning:

  • files: declares the discovery root
  • glob: declares which files count as cases
  • key: declares how the canonical case key is derived
  • forEachCase: runs the body once per discovered case

The currently supported key fields are:

  • path
  • name
  • stem

For filesystem discovery, the bound case object exposes:

  • key: canonical case key
  • path: normalized path under the declared root
  • name: basename including extension
  • stem: basename without extension
  • ext: extension without leading dot, or empty string

Example uses:

(case path)
(case stem)
(case name)
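The case fields correspond to standard path decomposition. As an illustrative analogy only (this is ordinary Python pathlib, not Observer's discovery internals), here is how the fields relate for a discovered file such as pkg/alpha.src under the root:

```python
from pathlib import PurePosixPath

# Analogy: decompose a case path the way the case object exposes it.
p = PurePosixPath("pkg/alpha.src")

path = str(p)               # normalized path under the root: "pkg/alpha.src"
name = p.name               # basename with extension: "alpha.src"
stem = p.stem               # basename without extension: "alpha"
ext = p.suffix.lstrip(".")  # extension without the dot, "" if none: "src"

print(path, name, stem, ext)
```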

7. Required vs Optional Discovery

Required discovery forms:

(... ) forEachCase: [ :case | ... ].
(... ) forEach: [ :testName | ... ].

Optional forms:

(... ) forEachCaseOptional: [ :case | ... ].
(... ) forEachOptional: [ :testName | ... ].

Use optional only when zero selected cases is acceptable.

Use required when empty selection should be treated as an error.

8. The Two Main Kinds Of Workflows

Observer currently supports two major full-surface patterns:

  • inventory-driven workflows using forEach: over inventory selectors
  • filesystem-driven workflows using forEachCase: over files

The shell examples in lib/shell focus on filesystem-driven workflows.

Inventory-driven full-surface example shape:

("Smoke::Version") forEach: [ :testName |
	(run: testName timeoutMs: 2000) ifOk: [ :r |
		expect: (r exit) = 0.
	] ifFail: [ :f |
		expect: Fail msg: "inventory run failed".
	].
].

Filesystem-driven shell example shape:

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	...
].

9. Statements You Can Write

The main statement forms are:

expect: <Predicate>.

publish: <String> kind: <ArtifactKind> path: <ValueExpr>.

(<ResultExpr>) ifOk: [ :<Binding> | <Statement>* ] ifFail: [ :<Binding> | <Statement>* ].

(<BoolExpr>) ifTrue: [ <Statement>* ] ifFalse: [ <Statement>* ].

You will use these constantly.

9.1 expect:

expect: records an assertion.

Examples:

expect: (compile exit) = 0.
expect: (run out) contains: "ok".
expect: (certificateStatus text) = "ok".
expect: Fail msg: "summary stage failed".

Important rule:

  • a failed assertion does not automatically abort the rest of the case

Observer keeps evaluating later statements unless a future surface explicitly requests early termination.

9.2 publish:

publish: creates a named artifact binding for the current case.

Example:

publish: "typed_unit" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "typed_unit.jsonl"].

This is what makes later artifact reuse explicit.

Artifact names are case-local.

Re-publishing the same artifact name in one case is a runtime error.
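The case-local, publish-once rule can be sketched as a per-case table that rejects duplicates. This is an analogy, not Observer's implementation; the class and method names are invented for illustration:

```python
# Sketch of the case-local publish rule: each case owns its own artifact
# table, and publishing the same name twice within one case is an error.
class CaseArtifacts:
    def __init__(self):
        self._table = {}

    def publish(self, name, kind, path):
        if name in self._table:
            raise RuntimeError(f"artifact {name!r} already published in this case")
        self._table[name] = (kind, path)

    def path(self, name):
        kind, path = self._table[name]
        return path

arts = CaseArtifacts()
arts.publish("typed_unit", "jsonl", ".observer/out/alpha/typed_unit.jsonl")

duplicate_rejected = False
try:
    arts.publish("typed_unit", "jsonl", "somewhere/else.jsonl")
except RuntimeError:
    duplicate_rejected = True

print(arts.path("typed_unit"), duplicate_rejected)
```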

9.3 ifOk: and ifFail:

These branch on action success or failure.

Example:

(proc: "./bin/compile" args: [...] timeoutMs: 2000) ifOk: [ :compile |
	expect: (compile exit) = 0.
] ifFail: [ :f |
	expect: Fail msg: "compile stage failed".
].

Important rule:

  • ifOk: means the action itself completed successfully as an Observer action
  • it does not mean the child process exited with code 0

For proc:, a nonzero process exit still gives you a successful action result with an exit field.

So this is correct:

(proc: "./bin/tool" args: [...] timeoutMs: 2000) ifOk: [ :r |
	expect: (r exit) = 0.
].

And this is what ifFail: is for:

  • spawn failure
  • timeout at the action boundary
  • protocol-level failure for actions that can fail structurally

9.4 ifTrue: and ifFalse:

These branch on a boolean expression.

Example:

((run out) contains: "ok") ifTrue: [
	expect: (run exit) = 0.
] ifFalse: [
	expect: Fail msg: "program output did not contain ok".
].

10. Actions

These are the core result-producing forms you are likely to use.

10.1 proc:

Runs a local executable with explicit arguments.

Shape:

proc: <String> args: <ArgArray> timeoutMs: <u32>

Example:

(proc: "./bin/emit-unit" args: [
	joinPath: ["corpus", (case path)],
	joinPath: [".observer/out", (case stem)],
	joinPath: [".observer/out", (case stem), "typed_unit.jsonl"]
] timeoutMs: 2000)

Result fields commonly used:

  • (r exit)
  • (r out)
  • (r err)

Use proc: for project-local tools and helper scripts.

10.2 run:

Runs an inventory-bound test by name.

Shape:

run: <ValueExpr> timeoutMs: <u32>

This is mostly for inventory-driven suites, not the shell starter examples.

10.3 artifactCheck:

Looks up a previously published artifact by name and kind.

Shape:

artifactCheck: <String> kind: <ArtifactKind>

Example:

(artifactCheck: "typed_unit" kind: "jsonl") ifOk: [ :typedUnit |
	...
].

This is the normal way to confirm a stage's published output exists and is available for downstream use.

10.4 extractJson: and extractJsonl:

Extract structured facts from published artifacts.

Shapes:

extractJson: <String> select: <String>
extractJsonl: <String> select: <String>

Example:

(extractJsonl: "typed_unit" select: "$.unit_id") ifOk: [ :unitId |
	expect: (unitId count) = 1.
	expect: (unitId text) = joinPath: ["unit", (case stem)].
].

Common result fields:

  • (x count)
  • (x text)

Use extraction when you want the workflow to assert on structured content rather than just file existence.
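Conceptually, a selector like $.unit_id collects the matching field from every JSONL record, and the count and text fields describe the collected hits. The sketch below is an analogy handling only a top-level "$.<key>" selector; Observer's real selector grammar may be richer:

```python
import json

# Analogy for extractJsonl: collect a top-level field from each JSONL
# record. Only "$.<key>" selectors are modeled here.
def extract_jsonl(lines, selector):
    key = selector.removeprefix("$.")
    records = [json.loads(line) for line in lines]
    hits = [r[key] for r in records if key in r]
    return {"count": len(hits), "text": hits[0] if hits else ""}

lines = ['{"unit_id": "unit/alpha", "ok": true}']
result = extract_jsonl(lines, "$.unit_id")
print(result)
```

This is why the count assertion in the example above matters: it pins down that exactly one record carried the field, not merely that some record did.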

10.5 httpGet: and tcp:

These exist in the full surface but are not the main focus of the shell starter examples.

They are useful when a local workflow also needs explicit protocol checks.

11. Value Constructors And Accessors

11.1 Field Access

Use parentheses to access fields on a bound value.

Examples:

(case stem)
(compile exit)
(run out)
(unitId text)
(f kind)

11.2 artifactPath:

Turns a named artifact binding into the underlying concrete path passed to a downstream process.

Example:

artifactPath: "typed_unit"

This is one of the most important forms in the language.

It is what prevents heuristic path reconstruction.

Do this:

(proc: "./bin/certify" args: [artifactPath: "typed_unit"] timeoutMs: 2000)

Do not instead guess where the previous stage probably wrote its output.

11.3 joinPath:

Builds explicit paths from explicit components.

Example:

joinPath: [".observer/out", (case stem), "summary.jsonl"]

Use this for:

  • output directories
  • output file paths
  • deterministic expected path construction

11.4 Fail msg:

Constructs an explicit failure value for assertion.

Example:

expect: Fail msg: "compile stage failed".

This is useful when you want a case to fail with a clear workflow-specific message rather than only exposing a lower-level action detail.

12. Predicates And Comparisons

Current predicate vocabulary includes:

(<ValueExpr>) = <ValueExpr>
(<ValueExpr>) != <ValueExpr>
(<ValueExpr>) < <ValueExpr>
(<ValueExpr>) <= <ValueExpr>
(<ValueExpr>) > <ValueExpr>
(<ValueExpr>) >= <ValueExpr>
(<ValueExpr>) contains: <ValueExpr>
(<ValueExpr>) contains: /<Regex>/
(<ValueExpr>) startsWith: <ValueExpr>
(<ValueExpr>) endsWith: <ValueExpr>
(<ValueExpr>) match: /<Regex>/
<ValueExpr> isStatus: <Int>
<ValueExpr> isStatusClass: <Int>
<ValueExpr> hasHeader: <String>

Examples:

expect: (compile exit) = 0.
expect: (run out) contains: "ok".
expect: (f msg) contains: "No such file".
expect: (run out) match: /ok|healthy/.

13. Artifact Kinds

Common artifact kinds used in the examples are:

  • file
  • json
  • jsonl

Choose the kind that matches what the stage actually materializes.

Do not publish a JSONL file as file unless you deliberately want to avoid structured extraction.

If you intend to call extractJson: or extractJsonl:, publish the artifact with the matching structured kind.

14. Bindings And Scope

Bindings only come from explicit places.

You do not have general local variables.

Bindings are introduced by:

  • the suite item case binding, such as :case
  • ifOk: bindings, such as :compile
  • ifFail: bindings, such as :f

Examples:

(files: ... ) forEachCase: [ :case |
	(proc: ... ) ifOk: [ :compile |
		...
	] ifFail: [ :f |
		...
	].
].

Each binding is scoped to its block.

15. Ordering And Determinism

This part is not optional.

Observer is designed around these constraints:

  • case discovery order must be deterministic
  • action order in one case is exactly source order
  • artifact reuse must be explicit
  • verdicts must be mechanically derived from explicit contract data

That means:

  • no hidden shell interpolation semantics in the language
  • no implicit downstream artifact guessing
  • no depending on host filesystem iteration order
  • no free-form mutable program state in the suite

If a workflow idea depends on heuristic inference, it is probably outside the intended model.

16. What ifOk: Does And Does Not Mean

This point is important enough to isolate.

For proc::

  • action success means Observer successfully ran the process and captured a result
  • child exit code is just one field on that result

So this pattern is correct:

(proc: "./bin/tool" args: [...] timeoutMs: 2000) ifOk: [ :r |
	expect: (r exit) = 0.
] ifFail: [ :f |
	expect: Fail msg: "tool could not be run".
].

Use ifFail: for action-level failure.

Use (r exit) = 0 for subject/process-level success.

This distinction matters in real workflows.
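The same distinction exists in plain subprocess handling, which may make it concrete. In this hedged analogy, a missing executable is the ifFail: case (a spawn failure), while a nonzero exit code is a normal ifOk: result you assert on:

```python
import subprocess

# Analogy: separate action-level failure (spawn, timeout) from the
# subject-level outcome (exit code on a successfully run process).
def run_tool(argv):
    try:
        r = subprocess.run(argv, capture_output=True, text=True, timeout=2)
    except FileNotFoundError:
        return {"ok": False, "kind": "spawn"}      # the ifFail: case
    except subprocess.TimeoutExpired:
        return {"ok": False, "kind": "timeout"}    # also ifFail:
    return {"ok": True, "exit": r.returncode}      # ifOk:, even if exit != 0

good = run_tool(["true"])            # exists, exits 0
bad = run_tool(["false"])            # exists, exits 1 -- still an ok action
missing = run_tool(["./no-such-tool"])

print(good, bad, missing)
```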

17. Reading The Starter Examples

Recommended order:

  1. artifact_roundtrip.obs
  2. stage_failure.obs
  3. multi_artifact_pipeline.obs
  4. starter/
  5. starter-pipeline/
  6. starter-pipeline-failure/
  7. compiler_workflow.obs

Why this order:

  • the early .obs files teach isolated constructs
  • starter/ shows the smallest runnable project shape
  • starter-pipeline/ gives the main artifact-chain "aha"
  • starter-pipeline-failure/ shows a deterministic staged break
  • compiler_workflow.obs shows the larger intended destination

17.1 Passing Walkthrough: Case Discovery To Final Artifact

Use starter-pipeline/ for this walkthrough.

The main flow is:

  1. discover cases from corpus/**/*.src
  2. compile each case into a published typed_unit
  3. certify that unit into a published certificate
  4. summarize both artifacts into a published summary
  5. assert over extracted JSONL facts from those artifacts

In tests.obs, discovery begins here:

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	...
].

That means Observer creates deterministic cases from the filesystem first.

For each case, the first stage publishes:

publish: "typed_unit" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "typed_unit.jsonl"].

The second stage does not guess that path again. It consumes the artifact contractually:

(proc: "./bin/certify-unit" args: [
	artifactPath: "typed_unit",
	...
] timeoutMs: 2000)

Then the final stage consumes both typed_unit and certificate, publishes summary, and the workflow extracts facts like:

  • $.unit_id
  • $.certificate_status
  • $.pipeline
  • $.case_key

So the shell model is not:

  • run one big script
  • hope files appear in the right places

It is:

  • discover cases deterministically
  • run explicit stages in order
  • publish named artifacts
  • consume artifacts by name
  • assert over canonical extracted facts

17.2 Failing Walkthrough: Stage Failure Without Model Change

Use starter-pipeline-failure/ immediately after the passing starter.

The key failure is deterministic:

  • alpha has bin/alpha/certify-unit
  • beta does not have bin/beta/certify-unit

The failing stage is written as:

(proc: joinPath: ["./bin", (case stem), "certify-unit"] args: [
	artifactPath: "typed_unit",
	...
] timeoutMs: 2000) ifOk: [ :certify |
	...
] ifFail: [ :f |
	expect: (f kind) = "spawn".
	expect: Fail msg: "certify stage failed".
].

This is the important point:

  • the workflow shape did not change
  • case discovery did not change
  • artifact publication rules did not change
  • only one stage for one case failed structurally

So the model remains deterministic.

Observer records that beta reached the certify stage and that the stage failed with a spawn failure. Downstream summary generation does not run for that case, because the artifact contract for the missing certificate was never satisfied.

That is the shell equivalent of the C-side distinction between transport success and test failure: you must separate action failure from normal stage outcome.

18. How To Start A New Workflow In Practice

A practical workflow-authoring process looks like this.

Step 1: Choose the case source

Usually this is a corpus directory.

Example:

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	...
].

Step 2: Make each stage a real local tool

Keep stage logic in project-local executables or scripts.

Examples:

  • ./bin/compile
  • ./bin/certify
  • ./bin/summarize

Observer should orchestrate these tools, not replace them.

Step 3: Publish stage outputs explicitly

After a stage succeeds, publish what matters.

Example:

publish: "typed_unit" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "typed_unit.jsonl"].

Step 4: Consume artifacts by name

Do not reconstruct paths heuristically later.

Example:

(proc: "./bin/certify" args: [artifactPath: "typed_unit"] timeoutMs: 2000)

Step 5: Extract structured facts when they matter

Example:

(extractJsonl: "typed_unit" select: "$.unit_id") ifOk: [ :unitId |
	expect: (unitId count) = 1.
].

Step 6: Separate action failure from workflow failure

Example:

(proc: "./bin/certify" args: [artifactPath: "typed_unit"] timeoutMs: 2000) ifOk: [ :cert |
	expect: (cert exit) = 0.
] ifFail: [ :f |
	expect: Fail msg: "certify stage failed".
].

Step 7: Freeze the canonical output

Use a checked-in report snapshot.

The starter examples show this pattern with:

  • make report
  • expected.default.jsonl
  • make verify
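What make verify amounts to is a mechanical, line-for-line comparison of the live report against the checked-in snapshot. A minimal sketch, assuming the reports are compared as ordered JSONL lines (file names mirror the starter examples; the comparison logic, not the names, is the point):

```python
# Sketch of snapshot verification: the live report's lines must equal
# the checked-in expected lines exactly, in order.
def verify(live_lines, expected_lines):
    return "ok" if live_lines == expected_lines else "drift"

live = ['{"case":"alpha","verdict":"pass"}']
expected = ['{"case":"alpha","verdict":"pass"}']

clean = verify(live, expected)
drifted = verify(live + ['{"case":"beta","verdict":"fail"}'], expected)
print(clean, drifted)
```

Because the report is canonical and deterministic, any difference is a real behavior change, not noise, which is what makes freezing it as a snapshot worthwhile.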

19. A More Complete Example

This shape is representative of a small multi-stage pipeline:

module StarterPipeline.

(files: "corpus" glob: "**/*.src" key: stem) forEachCase: [ :case |
	(proc: "./bin/emit-unit" args: [
		joinPath: ["corpus", (case path)],
		joinPath: [".observer/out", (case stem)],
		joinPath: [".observer/out", (case stem), "typed_unit.jsonl"]
	] timeoutMs: 2000) ifOk: [ :emitUnit |
		expect: (emitUnit exit) = 0.
		publish: "typed_unit" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "typed_unit.jsonl"].

		(artifactCheck: "typed_unit" kind: "jsonl") ifOk: [ :typedUnit |
			(extractJsonl: "typed_unit" select: "$.unit_id") ifOk: [ :unitId |
				expect: (unitId count) = 1.
				expect: (unitId text) = joinPath: ["unit", (case stem)].
			].

			(proc: "./bin/certify-unit" args: [
				artifactPath: "typed_unit",
				joinPath: [".observer/out", (case stem)],
				joinPath: [".observer/out", (case stem), "certificate.jsonl"]
			] timeoutMs: 2000) ifOk: [ :certify |
				expect: (certify exit) = 0.
				publish: "certificate" kind: "jsonl" path: joinPath: [".observer/out", (case stem), "certificate.jsonl"].
			].
		].
	].
].

Use the runnable version in starter-pipeline/ as the real reference.

20. Common Mistakes

Mistake 1: guessing artifact paths downstream

Bad:

(proc: "./bin/certify" args: [joinPath: [".observer/out", (case stem), "typed_unit.jsonl"]] timeoutMs: 2000)

Better:

(proc: "./bin/certify" args: [artifactPath: "typed_unit"] timeoutMs: 2000)

Mistake 2: assuming ifOk: means exit code zero

Bad:

(proc: "./bin/tool" args: [...] timeoutMs: 2000) ifFail: [ :f |
	;; expecting normal process exit handling here
].

Better:

(proc: "./bin/tool" args: [...] timeoutMs: 2000) ifOk: [ :r |
	expect: (r exit) = 0.
].

Mistake 3: hiding workflow semantics inside one giant shell script

If one shell script internally discovers cases, computes paths, runs all stages, and decides pass or fail, Observer sees too little.

Push stage logic into small tools, and let Observer own:

  • case discovery
  • stage ordering
  • artifact publication
  • structured checks
  • final verdicts

Mistake 4: using the wrong artifact kind

If you want to extract JSONL fields, publish as jsonl, not generic file.

Mistake 5: depending on undeclared ambient state

If a stage depends on some file, directory, or tool, make that dependency explicit in the workflow or local project layout.

21. How To Run The Examples

From a starter directory such as lib/shell/starter-pipeline:

make run
make report
make verify
make clean

Typical meanings:

  • make run: human-readable console flow
  • make report: canonical JSONL report
  • make verify: compare against checked-in snapshot
  • make clean: remove generated .observer output

22. Keyword And Form Reference

This is the quick lookup section.

Top-level and item keywords

  • module: optional module header
  • forEach: required inventory-driven iteration
  • forEachOptional: optional inventory-driven iteration
  • files:: filesystem discovery root
  • glob:: filesystem discovery pattern
  • key:: case-key derivation field
  • forEachCase: required filesystem-driven iteration
  • forEachCaseOptional: optional filesystem-driven iteration

Statement keywords

  • expect:: record an assertion
  • publish:: publish a named artifact
  • ifOk:: success branch for a result expression
  • ifFail:: failure branch for a result expression
  • ifTrue:: true branch for a boolean expression
  • ifFalse:: false branch for a boolean expression

Action keywords

  • run:: execute an inventory test
  • proc:: execute a local process
  • httpGet:: perform an HTTP GET
  • tcp:: perform a TCP probe
  • artifactCheck:: look up a named artifact
  • extractJson:: extract from a JSON artifact
  • extractJsonl:: extract from a JSONL artifact

Value helpers

  • artifactPath:: resolve a named artifact to a path value
  • joinPath:: construct a path from explicit parts
  • Fail msg:: explicit failure value for assertion

Predicate vocabulary

  • =
  • !=
  • <
  • <=
  • >
  • >=
  • contains:
  • startsWith:
  • endsWith:
  • match:
  • isStatus:
  • isStatusClass:
  • hasHeader:

23. Current Limits

The current full surface does not provide:

  • user-defined functions
  • mutation
  • general local variables
  • arbitrary loops
  • implicit shell pipelines as language constructs
  • heuristic path inference
  • hidden file IO primitives inside the suite language

That is deliberate.

The point is to keep workflows explicit, deterministic, and mechanically derivable.

24. Where To Look Next

Use these in order:

  • lib/shell/README.md for the example index
  • lib/shell/starter/ for the smallest runnable example
  • lib/shell/starter-pipeline/ for the main passing artifact pipeline
  • lib/shell/starter-pipeline-failure/ for the failing companion
  • lib/shell/compiler_workflow.obs for the larger target shape
  • specs/30-suite.md for the suite surface definition
  • specs/50-workflow-verification.md for the workflow model and constraints

If you keep one rule in mind, keep this one:

Publish artifacts explicitly, then consume them by name.