Feature/viewer asset flow #16

Open
darrell-d wants to merge 12 commits into main from feature/viewer-asset-flow

Conversation


@darrell-d darrell-d commented Apr 30, 2026

Switched the post-processor to a viewer-asset-aware flow: it now creates a viewer asset in packages-service, links each input package to it, creates channels via pennsieve-api with the asset id added to the row, uploads chunk files directly to S3 using STS credentials returned by packages-service, and registers ranges through timeseries-service's new POST /package/{id}/ranges endpoint.
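The flow above, as a minimal sketch. All client classes and method names here (`create_asset`, `link_package`, `get_sts_credentials`, ...) are illustrative stand-ins for the real service clients, not their actual APIs:

```python
def import_timeseries_via_assets(packages, chunk_files, assets, channels, uploader, ranges):
    asset = assets.create_asset()                      # viewer asset in packages-service
    for pkg in packages:
        assets.link_package(asset["id"], pkg)          # link each input package to the asset
    for pkg in packages:
        channels.create_channel(pkg, viewer_asset_id=asset["id"])  # via pennsieve-api
    creds = assets.get_sts_credentials(asset["id"])    # STS creds returned by packages-service
    uploader.upload(chunk_files, creds)                # chunk files straight to S3
    ranges.create_ranges(asset["id"], chunk_files)     # POST /package/{id}/ranges
    assets.update_asset(asset["id"], status="active")  # activate only after everything succeeds
    return asset
```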

The previous legacy flow is still reachable via a LEGACY_IMPORT_FLOW env var for rollback.

Re-running the same workflow is now idempotent (existing active assets short-circuit; failed prior runs are deleted and replaced), failures are recoverable (channels created in a failed run are torn down before the asset itself, so the next run starts clean), and the HTTP clients are safer (explicit timeouts on every request, retries on 5xx and connection-level errors, fail-fast on 4xx and on any malformed input file). Asset metadata is configurable per converter via ASSET_NAME.

Also lots of unit tests. Thanks Claude

darrell-d and others added 12 commits April 29, 2026 17:37
- new clients (PackagesAssetsClient and TimeSeriesRangesClient)
- Create and add viewer asset ids to channels table
- Attempts to fail fast and retry gracefully
Covers the new clients and end-to-end orchestration:
- PackagesAssetsClient: create/list/update/delete plus retry behavior
  (5xx-with-recovery, no-retry-on-4xx).
- TimeSeriesRangesClient: create + batched submission, oversize batch
  rejection, abort-on-first-failure, snake_case body shape.
- AssetUploader: STS-bound boto3 client construction, parallel uploads
  with prefixed keys, ClientError propagation, empty-input no-op.
- Integration tests for import_timeseries_via_assets: happy path,
  idempotent skip on active asset, stale-asset delete-and-recreate,
  upload failure triggers asset DELETE for cleanup, empty directory
  short-circuit.
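The batched-submission behavior the TimeSeriesRangesClient tests cover can be sketched roughly like this; the batch limit, the snake_case field names, and the `post_batch` callable are illustrative assumptions, not the real client surface:

```python
MAX_BATCH = 100  # illustrative; the real limit lives in the client

def submit_ranges(ranges, post_batch, batch_size=MAX_BATCH):
    if batch_size > MAX_BATCH:
        # oversize batch rejection: refuse up front rather than let the service 4xx
        raise ValueError(f"batch size {batch_size} exceeds limit {MAX_BATCH}")
    for start in range(0, len(ranges), batch_size):
        batch = ranges[start:start + batch_size]
        body = [{"channel_id": r[0], "start_time": r[1], "end_time": r[2]}  # snake_case body shape
                for r in batch]
        if not post_batch(body):
            raise RuntimeError(f"batch starting at {start} failed; aborting")  # abort on first failure
```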

225 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Endpoint can accept 10x what we were sending.
Keys were committed to dev.env. That key was found and removed, so it's no longer viable. Still embarrassingly in history, but useless to anyone who tries it.

Using a dev.env.example that can remain committed, and git-ignoring dev.env going forward.
_find_or_create_asset was not looking at children for assets, only the parent, which would always come back blank.
Covers the bug fix in 6f49f5f: when a workflow has multiple packages,
re-running an ingest finds the existing asset by iterating
workflow.package_ids rather than looking up by the parent
collection (which is never linked to the asset).

The test sets up 3 children + 1 parent, registers an active asset
linked only to the children, and asserts:
  - lookup-by-first-child returns empty
  - lookup-by-second-child returns the asset and short-circuits
  - third child is never queried (iteration stops on first hit)
  - no upload, no POST /assets — purely idempotent
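The fixed lookup can be sketched like this; `list_assets` is an illustrative stand-in for the packages-service list call, not its real signature:

```python
def find_existing_asset(package_ids, list_assets):
    """Iterate the workflow's package ids, not the parent collection
    (which is never linked to the asset), and stop on the first hit."""
    for pkg_id in package_ids:
        assets = list_assets(pkg_id)
        if assets:
            return assets[0]  # short-circuit: later children are never queried
    return None
```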

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Items 4-8 from review feedback, plus the fail-fast changes for
unparseable filenames.

Failure-recovery fixes:
- _create_or_resolve_channels now returns the list of channels it
  *created* (not reused). On exception, the orchestrator deletes
  those channels first, then the asset. Without this, partial
  failure left channels with viewer_asset_id pointing at a deleted
  asset, breaking the next re-run on the mismatch check.
- New delete_channel method on TimeSeriesClient for that cleanup
  path. Failures are logged but do not block asset deletion —
  best-effort.
- update_asset(status='active') failures no longer swallowed. They
  propagate, trigger cleanup, and let the next run start clean.
  Swallowing left assets stuck in 'created', misclassified as stale
  on re-run, and triggered delete-and-recreate that compounded the
  channel-mismatch problem.
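The cleanup ordering described above, as a hedged sketch with injected callables standing in for the real orchestrator and clients (all names here are hypothetical):

```python
def run_with_cleanup(create_channels, activate, delete_channel, delete_asset, asset_id):
    """On failure, tear down only the channels *created* by this run,
    channels first, then the asset; channel-delete failures are
    best-effort and never block asset deletion."""
    created = []
    try:
        created = create_channels()   # returns only newly created channels, not reused ones
        activate()                    # update_asset(status='active'); errors propagate
    except Exception:
        for ch in created:
            try:
                delete_channel(ch)    # best-effort; logged in the real code
            except Exception:
                pass                  # never blocks asset deletion
        delete_asset(asset_id)        # asset goes last, so the next run starts clean
        raise
```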

Fail-fast on malformed input:
- Unparseable channel metadata filename → RuntimeError (was warning
  + skip).
- Unparseable chunk filename → RuntimeError.
- Chunk pointing at unknown channel index → RuntimeError.
- Empty channels_by_index or renamed_data_files after the helpers
  → RuntimeError. Defensive belt-and-suspenders to ensure the
  asset never reaches 'active' with zero data.
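A minimal sketch of the fail-fast parsing; the filename pattern here is invented for illustration (the real naming convention lives in the converter):

```python
import re

# Hypothetical chunk-file naming convention, e.g. "channel-3_chunk-7.bin"
CHUNK_NAME = re.compile(r"^channel-(\d+)_chunk-(\d+)\.bin$")

def parse_chunk_filename(name, known_channel_indices):
    m = CHUNK_NAME.match(name)
    if m is None:
        raise RuntimeError(f"unparseable chunk filename: {name!r}")  # was warning + skip
    channel_idx = int(m.group(1))
    if channel_idx not in known_channel_indices:
        raise RuntimeError(f"chunk {name!r} points at unknown channel index {channel_idx}")
    return channel_idx, int(m.group(2))
```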

HTTP client robustness:
- @backoff decorators now catch requests.RequestException (was just
  HTTPError). Covers timeouts, connection resets, DNS failures —
  they retry instead of failing immediately.
- _is_client_error gives up only on 4xx responses; everything else
  retries.
- DEFAULT_TIMEOUT=(5,30) constant in base_client; threaded into
  every request in packages_assets_client and
  timeseries_ranges_client. No more indefinite hangs on stalled
  services.
- Dropped unused asdict import.
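The retry policy above boils down to: give up only on 4xx, retry everything else. A stdlib-only stand-in for the @backoff-decorated calls (the exception's `status_code` attribute and the helper names are illustrative):

```python
import time

DEFAULT_TIMEOUT = (5, 30)  # (connect, read) seconds, as added in base_client

def _is_client_error(exc):
    """Give-up predicate: only a 4xx response stops retrying."""
    status = getattr(exc, "status_code", None)
    return status is not None and 400 <= status < 500

def call_with_backoff(fn, max_tries=3, delay=0.0):
    """Retries any exception (timeouts, connection resets, 5xx responses)
    unless _is_client_error says give up or the tries are exhausted."""
    for attempt in range(1, max_tries + 1):
        try:
            return fn()
        except Exception as exc:
            if _is_client_error(exc) or attempt == max_tries:
                raise
            time.sleep(delay)
```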

Tests:
- TestFailureCleanup split into two: channel-cleanup-on-failure
  asserts both DELETEs (channel then asset) fire in order;
  reused-channels-not-deleted asserts an existing channel reused
  by this run is left alone on cleanup.
- TestFailFastOnMalformedFiles covers the three RuntimeError
  paths (bad metadata filename, bad chunk filename, orphan chunk).

230 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Also cleaned up some comments for readability and accuracy.
@darrell-d darrell-d requested a review from rohanshah April 30, 2026 20:22

@rohanshah rohanshah left a comment


@darrell-d this is interesting because it would be cleaner to just use the viewer asset data target but we need the viewer asset ID to create the channels...

not sure if you've discussed with @jwagenaar but I wonder if he has any thoughts around that
