2 changes: 2 additions & 0 deletions .gitignore
@@ -209,3 +209,5 @@ __marimo__/
# Living Doc Toolkit specific
outputs/
.DS_Store
doc-issues.json
pdf_ready.json
2 changes: 1 addition & 1 deletion apps/cli/src/living_doc_cli/commands/normalize_issues.py
@@ -48,7 +48,7 @@ def format_error_message(error: ToolkitError) -> str:
# Add actionable guidance based on error type
guidance_map = {
InvalidInputError: "Ensure --input points to a valid file.",
AdapterError: "Check metadata.generator.name field.",
AdapterError: "Check metadata.producer.name field.",
SchemaValidationError: "Review the output schema requirements.",
NormalizationError: "Check input data format and content.",
FileIOError: "Ensure output directory exists and is writable.",
2 changes: 1 addition & 1 deletion apps/cli/tests/test_cli.py
@@ -169,7 +169,7 @@ def test_normalize_issues_adapter_error(mock_run_service, runner):
assert result.exit_code == 2
assert "Adapter error:" in result.output
assert "No compatible adapter found for input" in result.output
assert "Check metadata.generator.name field" in result.output
assert "Check metadata.producer.name field" in result.output


@patch("living_doc_cli.commands.normalize_issues.run_service")
6 changes: 3 additions & 3 deletions docs/architecture.md
@@ -80,7 +80,7 @@ flowchart TD
Start([Start]) --> Load[Load Input JSON]
Load --> Detect{Auto-detect<br/>Adapter?}

Detect -->|Yes| AutoDetect[Scan metadata.generator.name]
Detect -->|Yes| AutoDetect[Scan metadata.producer.name]
Detect -->|No| ExplicitAdapter[Use --source adapter]

AutoDetect --> CheckAdapter{Adapter<br/>Found?}
@@ -285,7 +285,7 @@ sequenceDiagram
alt Auto-detect mode
Service->>Registry: Find compatible adapter
Registry->>Adapter: can_handle(payload)?
Adapter->>Registry: Yes (metadata.generator.name matches)
Adapter->>Registry: Yes (metadata.producer.name matches)
Registry->>Service: Return CollectorGhAdapter
else Explicit mode
Service->>Registry: Get adapter by name
@@ -402,7 +402,7 @@ flowchart LR
{
"code": "VERSION_MISMATCH",
"message": "Producer version 2.1.0 is outside confirmed range",
"context": "metadata.generator.version"
"context": "metadata.producer.version"
}
]
}
8 changes: 4 additions & 4 deletions docs/contracts.md
@@ -49,8 +49,8 @@ Produced by [living-doc-collector-gh](https://github.com/AbsaOSS/living-doc-coll
### Producer Detection

Adapter auto-detection checks:
- `metadata.generator.name` == `"AbsaOSS/living-doc-collector-gh"`
- `metadata.generator.version` — semver format
- `metadata.producer.name` == `"AbsaOSS/living-doc-collector-gh"`
- `metadata.producer.version` — semver format
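
As an illustration only, the two checks above can be sketched in Python. The function name `can_handle` echoes the sequence diagram in `docs/architecture.md`, but its signature, the `EXPECTED_PRODUCER` and `SEMVER_RE` names, and the handling of missing keys are assumptions rather than the adapter's actual code.

```python
import re

# Assumed constant: the producer name quoted in the list above.
EXPECTED_PRODUCER = "AbsaOSS/living-doc-collector-gh"

# Simplified semver pattern: MAJOR.MINOR.PATCH with optional pre-release and
# build suffixes. The real adapter may validate versions differently.
SEMVER_RE = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def can_handle(payload: dict) -> bool:
    """Return True when the payload looks like collector-gh output."""
    metadata = payload.get("metadata")
    producer = metadata.get("producer") if isinstance(metadata, dict) else None
    if not isinstance(producer, dict):
        return False
    name = producer.get("name")
    version = producer.get("version")
    return (
        name == EXPECTED_PRODUCER
        and isinstance(version, str)
        and SEMVER_RE.match(version) is not None
    )
```

A payload that still uses the old `metadata.generator` key fails this check, which is the situation the `Adapter error:` guidance is meant to catch.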

### Compatibility Policy

@@ -67,7 +67,7 @@ Warning format in audit:
{
"code": "VERSION_MISMATCH",
"message": "Producer version 2.1.0 is outside confirmed range >=1.0.0,<2.0.0",
"context": "metadata.generator.version"
"context": "metadata.producer.version"
}
```
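
A minimal sketch of how such a warning entry could be produced, assuming plain `MAJOR.MINOR.PATCH` version strings and the `>=1.0.0,<2.0.0` range quoted in the message. The helper names `parse_version` and `version_warning` are hypothetical and are not taken from the adapter.

```python
from typing import Optional


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string into a comparable tuple."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def version_warning(
    version: str, minimum: str = "1.0.0", maximum: str = "2.0.0"
) -> Optional[dict]:
    """Return a VERSION_MISMATCH entry, or None if the version is in range.

    The range is inclusive of `minimum` and exclusive of `maximum`.
    """
    if parse_version(minimum) <= parse_version(version) < parse_version(maximum):
        return None
    return {
        "code": "VERSION_MISMATCH",
        "message": (
            f"Producer version {version} is outside confirmed range "
            f">={minimum},<{maximum}"
        ),
        "context": "metadata.producer.version",
    }
```

Returning `None` for in-range versions keeps the caller simple: it appends the entry to the audit warnings only when one is returned.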

@@ -158,7 +158,7 @@ Each pipeline stage appends a trace entry:

| Collector field | Audit field |
|-----------------|-------------|
| `metadata.generator.*` | `audit.producer.*` |
| `metadata.producer.*` | `audit.producer.*` |
| `metadata.run.*` | `audit.run.*` |
| `metadata.source.*` | `audit.source.*` |
| Full original `metadata` | `audit.extensions["collector-gh"].original_metadata` |
20 changes: 10 additions & 10 deletions docs/cookbooks/normalize-issues.md
@@ -59,16 +59,16 @@ See [Contracts & Interfaces](../contracts.md#cli-interface) for the full argumen

### Auto-Detection

When `--source auto` is used (default), the service automatically detects the producer by examining the `metadata.generator.name` field in the input JSON:
When `--source auto` is used (default), the service automatically detects the producer by examining the `metadata.producer.name` field in the input JSON:

```python
if payload["metadata"]["producer"]["name"] == "AbsaOSS/living-doc-collector-gh":
    adapter = CollectorGhAdapter()
```

**Required Fields for Detection:**
- `metadata.generator.name` — Producer identifier (e.g., `"AbsaOSS/living-doc-collector-gh"`)
- `metadata.generator.version` — Producer version (semver format, e.g., `"1.2.0"`)
- `metadata.producer.name` — Producer identifier (e.g., `"AbsaOSS/living-doc-collector-gh"`)
- `metadata.producer.version` — Producer version (semver format, e.g., `"1.2.0"`)

### Explicit Adapter Selection

@@ -122,7 +122,7 @@ This policy ensures:
The adapter maps collector metadata to the audit envelope:

```
metadata.generator.* → audit.producer.*
metadata.producer.* → audit.producer.*
metadata.run.* → audit.run.*
metadata.source.* → audit.source.*
```
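
The mapping can be sketched as a small function. This is an illustration under assumptions: the function name `map_metadata_to_audit` is hypothetical, the audit envelope is modelled as a plain dict, and the `extensions` entry follows the mapping table in `docs/contracts.md`.

```python
import copy


def map_metadata_to_audit(metadata: dict) -> dict:
    """Copy collector metadata sections into an audit-shaped dict."""
    audit: dict = {}
    # metadata.producer.* -> audit.producer.*, and likewise for run and source.
    for section in ("producer", "run", "source"):
        if section in metadata:
            audit[section] = copy.deepcopy(metadata[section])
    # Keep the full original metadata under the collector-gh extension key.
    audit["extensions"] = {
        "collector-gh": {"original_metadata": copy.deepcopy(metadata)}
    }
    return audit
```

Deep copies are used so that later changes to the audit envelope cannot alter the input payload.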
@@ -202,7 +202,7 @@ When the producer version is outside the confirmed range, a warning is logged an
{
"code": "VERSION_MISMATCH",
"message": "Producer version 2.1.0 is outside confirmed range >=1.0.0,<2.0.0",
"context": "metadata.generator.version"
"context": "metadata.producer.version"
}
```

@@ -229,7 +229,7 @@ When the producer version is outside the confirmed range, a warning is logged an
**Common Causes:**
- File not found: `--input` path does not exist
- Malformed JSON: Syntax errors in input file
- Missing required fields: Input lacks `metadata.generator.name`
- Missing required fields: Input lacks `metadata.producer.name`

**Example:**
```
@@ -248,16 +248,16 @@ Invalid input: File 'doc-issues.json' not found. Ensure --input points to a vali
**Error Prefix:** `Adapter error:`

**Common Causes:**
- No compatible adapter found: `metadata.generator.name` does not match any known producer
- Missing metadata: Input lacks `metadata.generator` section
- No compatible adapter found: `metadata.producer.name` does not match any known producer
- Missing metadata: Input lacks `metadata.producer` section

**Example:**
```
Adapter error: No compatible adapter found for input. Check metadata.generator.name field.
Adapter error: No compatible adapter found for input. Check metadata.producer.name field.
```

**Solutions:**
- Inspect `metadata.generator.name`: `jq .metadata.generator.name doc-issues.json`
- Inspect `metadata.producer.name`: `jq .metadata.producer.name doc-issues.json`
- Verify input is from `AbsaOSS/living-doc-collector-gh`
- Use `--source collector-gh` to explicitly select adapter

4 changes: 2 additions & 2 deletions docs/troubleshooting.md
@@ -126,7 +126,7 @@ Invalid input: Malformed JSON in 'doc-issues.json'. Ensure the file contains val

**Message:**
```
Invalid input: Missing required field 'metadata.generator.name'. Check input structure.
Invalid input: Missing required field 'metadata.producer.name'. Check input structure.
```

**Causes:**
@@ -144,7 +144,7 @@ Invalid input: Missing required field 'metadata.generator.name'. Check input str

2. **Verify producer metadata:**
```bash
jq .metadata.generator doc-issues.json
jq .metadata.producer doc-issues.json
```

Expected output:
187 changes: 187 additions & 0 deletions packages/adapters/collector_gh/SCHEMA_SYNC.md
@@ -0,0 +1,187 @@
# Schema Synchronization Guide

## Pattern: Pydantic-First (Schema Producer / Data Consumer)

This adapter uses the **Pydantic-First** pattern where **this repository** (living-doc-toolkit):
- **Receives data** from collector-gh (data consumer role)
- **Produces schema** as an artifact for collector-gh to validate against (schema producer role)

The Pydantic models in this repo are the **single source of truth** for the input contract.

```
┌──────────────────────────────────────────────────────┐
│  living-doc-toolkit (This Repo)                      │
│  SCHEMA PRODUCER / DATA CONSUMER                     │
│                                                      │
│  • Pydantic models (models.py) ◄── SOURCE            │
│  • Export JSON Schema (schema_export.py)             │
│  • Save to: schemas/doc-issues-v1.0.0-schema.json    │
│  • Publish schema as artifact                        │
└──────────────────────────────────────────────────────┘
                           │ Schema published as independent artifact
                           │ (no direct code dependency)
                           ▼
┌──────────────────────────────────────────────────────┐
│  Downstream Consumers (Independent)                  │
│  SCHEMA CONSUMER / DATA PRODUCER                     │
│                                                      │
│  • Obtain published schema                           │
│  • Use it independently for validation               │
│  • Publish validated data                            │
└──────────────────────────────────────────────────────┘
```

**Key:** There is no direct code dependency. The schema is a published artifact that each
repo uses independently within its own validation pipeline.


## Schema Version

- **Input Schema Version:** `1.0.0` (independent of adapter package version)
- **Adapter Package Version:** `1.0.0` (see `__init__.py`)
- **Producer Compatibility Range:** `>=1.0.0,<2.0.0` (see `compatibility.py`)

## Workflow: When Pydantic Models Change

### 1. Consumer (living-doc-toolkit) Updates Model

Edit [models.py](src/living_doc_adapter_collector_gh/models.py):

```python
from pydantic import BaseModel, Field


class AdapterMetadataSource(BaseModel):
    """Source information for adapter metadata."""

    systems: list[str] = Field(min_length=1, description="At least one system")
    # ... other fields
```

### 2. Export Updated Schema

The schema is saved automatically with the version in its filename:

```bash
# From packages/adapters/collector_gh/
python -m living_doc_adapter_collector_gh.schema_export

# Schema is now in: schemas/doc-issues-v1.0.0-schema.json
```

Or programmatically:

```python
from living_doc_adapter_collector_gh import export_schema, SCHEMA_VERSION

schema = export_schema()  # Saved to default location with version
print(f"Schema version: {SCHEMA_VERSION}")  # 1.0.0
```

Or save to custom location:

```bash
python -m living_doc_adapter_collector_gh.schema_export /path/to/custom-schema.json
```

### 3. Validate Tests Pass

```bash
make pytest-unit-packages/adapters/collector_gh
```

### 4. Commit & Publish Schema as Artifact

Commit the schema change and publish it under its versioned filename:

```bash
# Commit the updated schema (versioned filename)
git add packages/adapters/collector_gh/schemas/doc-issues-v1.0.0-schema.json
git commit -m "chore: update input schema to v1.0.0

- systems field now requires min_length=1
- See packages/adapters/collector_gh/SCHEMA_SYNC.md for details"

# Create release with schema as artifact
# or include schema in release notes / documentation
```

Schema is now available at: `packages/adapters/collector_gh/schemas/doc-issues-v1.0.0-schema.json`

### 5. Downstream Consumers Obtain & Use Schema

Consumers (e.g., collector-gh repo):
- Obtain published schema (from GitHub release, documentation, etc.)
- Integrate into their validation pipeline
- Use to validate data
- **No direct code dependency** on this repo

Example consumer workflow:

```yaml
# .github/workflows/validate-output.yml
- name: Download schema
  run: |
    # -L follows the redirect that GitHub release downloads return
    curl -LO https://github.com/AbsaOSS/living-doc-toolkit/releases/download/v1.0.0/doc-issues-schema.json

- name: Validate output against schema
  # ajv-cli is an npm package, so it is run as a command here
  run: npx --yes ajv-cli validate -s doc-issues-schema.json -d doc-issues.json
```

## Workflow: When Producer Version Increments

If the producer releases a new version (e.g., `v1.1.0` or `v2.0.0`):

1. **Review the release notes**
2. **Identify breaking vs. non-breaking changes**
3. **If breaking:**
- Update `CONFIRMED_MIN` or `CONFIRMED_MAX` in [compatibility.py](src/living_doc_adapter_collector_gh/compatibility.py)
- Add test fixtures for the new version
- Document in [README.md](README.md)

4. **If non-breaking:**
- Add golden test fixture (no code changes needed)
- Verify compatibility test passes
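
As a first-pass triage for step 2, a semver major bump can be flagged automatically. This is only a sketch under the assumption that the producer follows semantic versioning; the function name `is_potentially_breaking` is hypothetical, and the release notes remain the authoritative source.

```python
def is_potentially_breaking(current: str, released: str) -> bool:
    """Flag a release whose MAJOR component is higher than the current one.

    Both arguments are MAJOR.MINOR.PATCH strings, with an optional leading "v".
    """
    current_major = int(current.lstrip("v").split(".")[0])
    released_major = int(released.lstrip("v").split(".")[0])
    return released_major > current_major
```

Under this convention `v1.1.0` would follow the non-breaking path and `v2.0.0` the breaking path, subject to what the release notes actually say.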

## File Locations

| File | Purpose |
|------|---------|
| [models.py](src/living_doc_adapter_collector_gh/models.py) | Pydantic models (source of truth) |
| [schema_export.py](src/living_doc_adapter_collector_gh/schema_export.py) | Export models to JSON Schema |
| [compatibility.py](src/living_doc_adapter_collector_gh/compatibility.py) | Version compatibility checking & schema version |
| [__init__.py](src/living_doc_adapter_collector_gh/__init__.py) | Package exports & documentation |
| [tests/test_parser.py](tests/test_parser.py) | Golden tests (fixture validation) |

## Key Constants

```python
# In compatibility.py
CONFIRMED_MIN = "1.0.0"  # Min producer version (inclusive)
CONFIRMED_MAX = "2.0.0" # Max producer version (exclusive)
SCHEMA_VERSION = "1.0.0" # Input contract schema version
```

## Testing

### Golden Tests (Verify Fixtures Match Model)

```bash
# Run golden tests
make pytest-unit-packages/adapters/collector_gh

# Specific test
pytest packages/adapters/collector_gh/tests/test_parser.py::TestParser::test_metadata_source_mapping
```

### Schema Export

```bash
# Verify schema can be generated
python -m living_doc_adapter_collector_gh.schema_export

# Write to file
python -m living_doc_adapter_collector_gh.schema_export schema.json
```

## Links

- **Producer Repo:** https://github.com/AbsaOSS/living-doc-collector-gh
- **Consumer (This Repo):** https://github.com/AbsaOSS/living-doc-toolkit
- **Input Contract Docs:** [../../../docs/contracts.md](../../../docs/contracts.md#input-contract-doc-issuesjson)
45 changes: 45 additions & 0 deletions packages/adapters/collector_gh/schemas/README.md
@@ -0,0 +1,45 @@
# Input Schema Artifacts

This directory contains the exported JSON Schema for the input contract.

## Schema File

- **`doc-issues-v1.0.0-schema.json`** — JSON Schema for doc-issues.json input data (schema version 1.0.0)

## How to Generate

From the package root (`packages/adapters/collector_gh/`):

```bash
# Generate and save to default location (this directory)
python -m living_doc_adapter_collector_gh.schema_export

# Or specify a custom output location
python -m living_doc_adapter_collector_gh.schema_export /path/to/custom-schema.json
```

## Usage

Downstream consumers (e.g., collector-gh repo) independently:
1. Obtain the published schema from this directory
2. Use it in their validation pipeline
3. Validate input data against the schema

Example with `ajv-cli`:

```bash
ajv validate -s doc-issues-v1.0.0-schema.json -d /path/to/doc-issues.json
```

## Schema Updates

When Pydantic models change:

1. Update the Pydantic models in `src/living_doc_adapter_collector_gh/models.py`
2. Run `python -m living_doc_adapter_collector_gh.schema_export` to regenerate the schema
3. A new versioned file is created: `doc-issues-v{VERSION}-schema.json`
4. Commit the updated schema
5. Release it as a new version
6. Downstream consumers obtain and use the updated schema
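
The versioned filename from step 3 can be derived as in the sketch below. The helper name `schema_path` is hypothetical, and the `SCHEMA_VERSION` value is copied from `SCHEMA_SYNC.md` rather than imported from the package; the real export module may build the path differently.

```python
from pathlib import Path

# Assumed value, as quoted in SCHEMA_SYNC.md.
SCHEMA_VERSION = "1.0.0"


def schema_path(schemas_dir: Path, version: str = SCHEMA_VERSION) -> Path:
    """Build the versioned schema path: doc-issues-v{VERSION}-schema.json."""
    return schemas_dir / f"doc-issues-v{version}-schema.json"
```

Because the version is part of the filename, a schema bump produces a new file next to the old one instead of overwriting it.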

See `SCHEMA_SYNC.md` for complete synchronization workflow.