Refactor duplicated codec encoding workflow into shared helpers #2

Description

@riley-1995

Problem

The codec adapters in encode/encode_encodec.py and encode/encode_speechtokenizer.py duplicate major workflow steps (audio preparation, device handling, per-layer embedding extraction loops). This duplication increases maintenance cost and drift risk when one adapter changes and the other does not.

Proposal

Extract shared helper functions for common encode workflow behavior first, while keeping codec-specific model loading and quantizer access in separate modules.

Evaluate whether a full merge into a single module is still beneficial after shared helpers are in place.
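As a rough illustration of the helper extraction proposed above, the sketch below separates the shared workflow (audio preparation and the per-layer extraction loop) from the codec-specific pieces. It is framework-agnostic on purpose: plain lists stand in for tensors, device handling is left out, and every name here (`CodecAdapter`, `prepare_audio`, `extract_layer_embeddings`, `encode`) is hypothetical rather than taken from the existing modules.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Waveform = List[float]   # mono samples; stand-in for a real tensor type
Embedding = List[float]  # one embedding vector; also a stand-in


@dataclass
class CodecAdapter:
    """Codec-specific pieces that would stay in each module (hypothetical shape)."""
    name: str
    num_layers: int
    # Returns the embedding for one quantizer layer, given prepared audio.
    layer_embedding: Callable[[Waveform, int], Embedding]


def prepare_audio(channels: Sequence[Sequence[float]]) -> Waveform:
    """Shared step: downmix multi-channel audio to mono by averaging channels."""
    if not channels or not channels[0]:
        raise ValueError("audio must contain at least one non-empty channel")
    count = len(channels)
    return [sum(frame) / count for frame in zip(*channels)]


def extract_layer_embeddings(adapter: CodecAdapter, audio: Waveform) -> List[Embedding]:
    """Shared step: run the per-layer loop once, for any codec."""
    return [adapter.layer_embedding(audio, layer) for layer in range(adapter.num_layers)]


def encode(adapter: CodecAdapter, channels: Sequence[Sequence[float]]) -> List[Embedding]:
    """Shared workflow: prepare audio, then collect one embedding per layer."""
    audio = prepare_audio(channels)
    return extract_layer_embeddings(adapter, audio)
```

With this shape, each codec module would only need to build its own `CodecAdapter` (model loading and quantizer access), and the existing public functions could delegate to `encode` without changing their signatures.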

Acceptance Criteria

  • Shared workflow helpers are extracted and used by both codec modules.
  • Public function behavior and return structure remain unchanged.
  • encode/collect.py integration continues to work without semantic changes.
  • A follow-up note records whether a full merge into one module is still recommended after helper extraction.
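One possible way to check the "return structure remains unchanged" criterion is to compare the shape of a result captured before the refactor with one produced after it. The sketch below assumes results are nested dicts, lists, or tuples with scalar leaves; the helper names `structure_of` and `assert_same_structure` are hypothetical, and real tensor outputs would need an extra case (for example, comparing shapes and dtypes).

```python
from typing import Any


def structure_of(value: Any) -> Any:
    """Reduce a nested result to its shape: dict keys, sequence lengths, leaf type names.

    Lists and tuples are both reported as lists, so this check does not
    distinguish between them.
    """
    if isinstance(value, dict):
        return {key: structure_of(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [structure_of(item) for item in value]
    return type(value).__name__


def assert_same_structure(before: Any, after: Any) -> None:
    """Fail if the refactored result no longer has the original shape."""
    assert structure_of(before) == structure_of(after), "return structure changed"
```

This only guards structure, not values; a behavior check would additionally compare the numbers themselves within a tolerance.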

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
