Refactor: ensure all task sources execute from a JSON file on disk #139

Description

@nicolasbisurgi

Background

The original RushTI 2.0 architecture intended that all task sources (TXT file, JSON file, TM1 cube view) would be normalised to a JSON file on disk before execution. This ensures a single, consistent execution path and makes debugging easier since there is always a JSON file to inspect.

Current Behaviour

Currently, the TM1 cube path (--tm1-instance + --workflow) builds a Taskfile Python object directly in memory from an MDX query result (DataFrame), and the DAG is constructed from that in-memory object. While a JSON archive is written as a side-effect (archive_taskfile), it is not part of the execution pipeline.

The file-based paths also read directly into memory rather than going through a normalised JSON intermediate.

Proposed Change

Refactor the execution pipeline so that:

  1. TM1 cube source: Read from cube → write normalised JSON to a temp/archive path → execute from that JSON file
  2. TXT file source: Convert TXT → write normalised JSON → execute from that JSON file
  3. JSON file source: Validate and normalise encoding → execute from the (possibly re-encoded) JSON file

This would give a single code path for DAG construction (parse_json_taskfile → convert_json_to_dag) regardless of the original source.
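To make the proposal concrete, here is a minimal sketch of the "normalise to disk, then execute from disk" flow. It does not use RushTI's real functions: `txt_to_tasks`, `write_normalised_json`, `load_for_execution`, and `run_from_source` are hypothetical names, the TXT format (one task per line, `key=value` tokens) is simplified, and the cube source is assumed to arrive as a list of dicts already fetched from TM1.

```python
import json
import shlex
import tempfile
from pathlib import Path


def txt_to_tasks(text: str) -> list[dict]:
    # Hypothetical TXT format: one task per line, made of key=value tokens.
    tasks = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = dict(token.split("=", 1) for token in shlex.split(line))
        tasks.append({"id": str(len(tasks) + 1), **fields})
    return tasks


def write_normalised_json(tasks: list[dict], out_dir: Path) -> Path:
    # Every source ends up here: a UTF-8 JSON file on disk.
    path = out_dir / "taskfile.json"
    path.write_text(json.dumps({"tasks": tasks}, indent=2), encoding="utf-8")
    return path


def load_for_execution(path: Path) -> list[dict]:
    # Single entry point for execution: always re-reads the file from disk.
    return json.loads(path.read_text(encoding="utf-8"))["tasks"]


def run_from_source(kind: str, payload, out_dir: Path) -> list[dict]:
    if kind == "txt":
        tasks = txt_to_tasks(payload)
    elif kind == "cube":
        tasks = list(payload)  # assumed: rows already read from the cube
    elif kind == "json":
        tasks = json.loads(payload)["tasks"]
    else:
        raise ValueError(f"unknown source kind: {kind}")
    # The in-memory list is discarded; execution sees only what is on disk.
    return load_for_execution(write_normalised_json(tasks, out_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_from_source("txt", 'instance="tm1" process="load"\n', Path(tmp)))
```

The point of the round trip is that whatever the DAG builder receives is exactly what can be inspected in the JSON file afterwards.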

Benefits

  • Single execution path simplifies maintenance and testing
  • Encoding normalisation (see Invalid Continuation Byte #137) applies uniformly to all sources
  • Always have a JSON file on disk for debugging and auditing
  • Archive becomes the input, not a side-effect
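
The "validate and normalise encoding" step could look roughly like the sketch below. It is an assumption, not the fix adopted for #137: the helper names are hypothetical, and falling back to cp1252 when UTF-8 decoding fails is only one plausible policy.

```python
from pathlib import Path


def decode_with_fallback(raw: bytes, encodings=("utf-8-sig", "cp1252")) -> tuple[str, str]:
    # Try each candidate encoding in order and report which one worked.
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"could not decode input with any of {encodings}")


def normalise_file_to_utf8(path: Path) -> str:
    # Rewrite the file in place as UTF-8 (no BOM); return the encoding detected.
    text, used = decode_with_fallback(path.read_bytes())
    path.write_bytes(text.encode("utf-8"))
    return used
```

After this step, every downstream reader can open the file with `encoding="utf-8"` without hitting a `UnicodeDecodeError`.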

Related

🤖 Generated with Claude Code

Labels: enhancement
