Vendor wespeaker inference to eliminate pyannote Dockerfile stubs #8105

Description

@beastoin

Context

PR #8082 added built-in wespeaker speaker embedding for batch diarization. PR #8085 patches the Dockerfile with 3 stubs to work around NGC ABI incompatibility with torchaudio/torch_audiomentations/opentelemetry.

Research by @geni (PR #8085 comment) confirmed:

  • No NVIDIA-published torchaudio wheel exists for NGC (known issue since 2023)
  • No pyannote inference-only install mode (torch_audiomentations is a hard dep)
  • pyannote 4.0 adds torchcodec as hard dep (another C extension ABI crash) — <4.0 pin is critical
  • All NGC users use the same stub workaround or avoid torchaudio entirely

Proposal

Vendor the wespeaker inference path (~60 lines) to eliminate pyannote.audio + ~25 transitive deps + all 3 Dockerfile stubs:

  1. torchaudio.compliance.kaldi.fbank: pure Python, already working in NGC via the patched __init__
  2. ResNet34 forward pass — load weights from HuggingFace, run inference directly
  3. L2 normalize output → 256-dim embedding
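The three steps above can be sketched as a small composition. This is a minimal, dependency-free illustration and not the vendored implementation: `embed`, `l2_normalize`, `fbank_fn`, and `model_fn` are hypothetical names, and the feature extractor and model are passed in as callables so the structure can be exercised without torch installed. In the real service, `fbank_fn` would presumably wrap `torchaudio.compliance.kaldi.fbank` and `model_fn` the ResNet34 forward pass.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> Vector:
    """Scale vec to unit Euclidean length (eps guards against a zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def embed(
    waveform: Sequence[float],
    fbank_fn: Callable[[Sequence[float]], List[Vector]],
    model_fn: Callable[[List[Vector]], Vector],
) -> Vector:
    """Compose the three steps: features -> forward pass -> L2 normalize."""
    features = fbank_fn(waveform)  # step 1: filterbank features (hypothetical callable)
    raw = model_fn(features)       # step 2: model forward pass (hypothetical callable)
    return l2_normalize(raw)       # step 3: unit-length embedding


# Toy stand-ins, only to show the call shape; they are not real audio code.
def toy_fbank(waveform: Sequence[float]) -> List[Vector]:
    return [[abs(x)] for x in waveform]


def toy_model(features: List[Vector]) -> Vector:
    return [3.0, 4.0]


toy_embedding = embed([0.1, -0.2, 0.3], toy_fbank, toy_model)
```

Passing the two heavy steps in as callables keeps the normalization and wiring testable on their own; the real embedding would have 256 dimensions rather than the two used in the toy model.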

This removes:

  • pyannote.audio, pyannote.core, pyannote.database, pyannote.pipeline
  • speechbrain, asteroid-filterbanks, einops, torch_audiomentations (stub)
  • tensorboardX, hf_transfer, semver
  • All 3 Dockerfile stubs (torchaudio patch, torch_audiomentations stub, opentelemetry stub)
  • The <4.0 version pin (no longer needed)
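One way to confirm the removal inside the built image is to ask the Python packaging metadata which of the listed distributions are still present. The sketch below is an assumption about how such a check could look, not part of the proposal: `REMOVED_DISTRIBUTIONS` and `still_installed` are hypothetical names, and the distribution names are copied from the list above as written.

```python
from importlib import metadata
from typing import Iterable, List

# Distribution names taken from the removal list above (as written in the issue).
REMOVED_DISTRIBUTIONS = [
    "pyannote.audio",
    "pyannote.core",
    "pyannote.database",
    "pyannote.pipeline",
    "speechbrain",
    "asteroid-filterbanks",
    "einops",
    "torch_audiomentations",
    "tensorboardX",
    "hf_transfer",
    "semver",
]


def still_installed(names: Iterable[str]) -> List[str]:
    """Return the subset of names that still resolve to an installed distribution."""
    found = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed, which is the desired outcome
        found.append(name)
    return found
```

Running `still_installed(REMOVED_DISTRIBUTIONS)` in the container and failing the build on a non-empty result would catch a dependency that is pulled back in transitively.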

Files

  • backend/parakeet/transcribe.py:33-65 — current pyannote import + get_builtin_embedding_model()
  • backend/parakeet/Dockerfile — current stubs

Acceptance criteria

  • Vendored wespeaker inference produces identical 256-dim embeddings to pyannote Inference
  • DER benchmark matches current implementation
  • All 3 Dockerfile stubs removed
  • pyannote.audio and transitive deps removed from Dockerfile
  • Existing unit tests updated and passing
  • Fallback to HTTP diarizer preserved
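The first criterion could be checked with a small comparison helper like the one below. It is a sketch under stated assumptions: the function names are hypothetical, embeddings are treated as plain lists of floats, and the `atol` tolerance is an illustrative value, since the issue asks for identical embeddings but floating-point output from two code paths may differ in the last bits.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embeddings_match(
    vendored: Sequence[float],
    reference: Sequence[float],
    dim: int = 256,      # embedding size stated in the issue
    atol: float = 1e-5,  # hypothetical tolerance, not from the issue
) -> bool:
    """True if both vectors have `dim` entries and agree element-wise within atol."""
    if len(vendored) != dim or len(reference) != dim:
        return False
    return max(abs(x - y) for x, y in zip(vendored, reference)) <= atol
```

In a unit test, `reference` would come from the current pyannote-based path and `vendored` from the new code, both run on the same audio; `cosine_similarity` is useful as a softer diagnostic when the strict check fails.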

Metadata

Labels: p3 (Priority: Backlog, score <14)
