Refactor/pipeline stages #2
Merged
Concise per-session brief for AI assistants (Claude Code, Cursor): canonical-docs pointer, non-negotiable invariants (no word-level output, retrieval-augmented, platform-pays, per-stage cache, augmentation-not-replacement), repo layout, conventions, and a phase status table that mirrors docs/plan/README.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 2 of docs/plan/. Five domain modules under src/audio/, each self-contained and lazy-importing its heavy dep so importing the module is free in tests that don't use that path:
* extractor.py: ffmpeg rip to 16 kHz mono WAV with mtime-aware caching under data/audio_cache/<video_id>.wav.
* asr.py: faster-whisper wrapper with thread-safe model singleton + word-level WordTiming output. VAD filter on; lazy import.
* prosody.py: librosa pyin + RMS at 50 ms stride → ProsodyFrame list with normalized RMS (99th-percentile reference) and voiced flag.
* emotion.py: LLM-from-text-and-prosody classifier (no second model on CPU). 7 labels (neutral|happy|sad|angry|anxious|questioning|emphatic), code-fence-tolerant JSON parsing, intensity clamped 0..1, defaults to neutral on malformed/empty.
* analyzer.py: ThreadPoolExecutor fuses ASR + prosody in parallel (CPU vs light work), then emotion (depends on both) into one AudioAnalysis.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
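The lazy-import plus thread-safe singleton pattern described for asr.py can be sketched roughly as below. This is an illustration under assumptions, not the repository's code: `get_model`, `_load_model`, and the `loader` parameter are hypothetical names, and the "base" model size is an arbitrary placeholder.

```python
import threading
from typing import Any, Callable, Optional

_model: Optional[Any] = None
_model_lock = threading.Lock()


def _load_model() -> Any:
    # The heavy dependency is imported only here, so importing this module
    # stays cheap for tests that never transcribe anything.
    from faster_whisper import WhisperModel  # deferred import

    return WhisperModel("base")  # placeholder model size (assumption)


def get_model(loader: Callable[[], Any] = _load_model) -> Any:
    """Return the shared model, loading it at most once across threads."""
    global _model
    if _model is None:  # fast path: no locking once the model exists
        with _model_lock:
            if _model is None:  # re-check after acquiring the lock
                _model = loader()
    return _model
```

Passing a fake `loader` in tests keeps the heavy dependency out of the test run while still exercising the locking path.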
* AudioIngestStage: download source video via src.audio.source_video, rip audio via src.audio.extractor, emit AudioIngestOutput with a repo-relative WAV path. Fingerprint covers video_id + sample rate.
* AudioAnalyzeStage: delegate to src.audio.analyzer.analyze; fingerprint covers audio_path + duration + every relevant audio setting (asr_model, compute_type, language, frame strides) + the LLM provider/model; flipping any of those invalidates this stage's cache without disturbing the upstream ingest cache.
* pipeline_avatar.py: instantiate both stages; add run_audio_only() helper that returns the typed AudioAnalysis so Phase 3 work can build on top without depending on later phases. Full run() still raises NotImplementedError until Phase 5 lands motion synthesis.
* stages/__init__.py: re-export the two new stages.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
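A per-stage fingerprint of this kind might look like the sketch below. The name `stable_hash` appears in the reviewed diff, but its implementation is not shown in this PR, so the body here is an assumption; `ingest_fingerprint` and `analyze_fingerprint` are hypothetical helpers, and frame strides are omitted for brevity.

```python
import hashlib
import json
from typing import Any, Sequence


def stable_hash(parts: Sequence[Any]) -> str:
    """Hash an ordered list of values into a short, deterministic hex key."""
    payload = json.dumps(list(parts), separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def ingest_fingerprint(video_id: str, sample_rate: int) -> str:
    # Upstream stage: only the inputs that affect the ripped WAV.
    return stable_hash(["audio_ingest", video_id, sample_rate])


def analyze_fingerprint(audio_path: str, duration_ms: int, asr_model: str,
                        compute_type: str, language: str,
                        provider: str, model: str) -> str:
    # Downstream stage: every setting that changes the analysis output.
    return stable_hash(["audio_analyze", audio_path, duration_ms, asr_model,
                        compute_type, language, provider, model])
```

Because the ingest key never sees `asr_model`, swapping the ASR model changes only the analyze key, which is the invalidation behaviour the commit describes.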
10 new tests covering:
* AudioIngestStage cache hit/miss behaviour with mocked download + extract (no network, no ffmpeg required to run the test).
* AudioAnalyzeStage fingerprint stability + asr_model-changes-cache-key invariant.
* Emotion classifier with FakeProvider: valid response, out-of-range clamp to neutral/1.0, malformed JSON falls back to neutral, code-fenced JSON parses, silent windows skip the provider call.
* Prosody extractor on a synthetic 440 Hz sine (skipped when librosa isn't installed; passes on environments that have it).
* faster-whisper smoke test (skipped when the dep isn't installed; marked slow).
requirements.txt: promote Phase 2 deps from commented placeholders to real entries (faster-whisper, librosa, soundfile, numpy).
pytest.ini: register the 'slow' marker so the suite runs clean with no warnings.
29 passing + 2 skipped (correctly guarded behind importorskip).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The phase status board in README.md, docs/plan/README.md, and CLAUDE.md now reflects Phase 2 completion. Phase 3 (interpreter brain) is next and consumes AudioAnalysis via run_audio_only() on the orchestrator.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Walks AudioAnalysis.asr_words and emits InterpreterChunks on either a hard boundary (silence >= vad_min_silence_ms) or a soft boundary (sentence punctuation) once the running text exceeds max_chunk_chars. Each chunk carries dominant emotion, F0 range, RMS mean, speaking rate, and an end-of-chunk pause flag for the interpreter LLM.
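The boundary rule above could be sketched as follows. This is a simplified reading, not the actual chunker: it assumes the length check gates only the soft (punctuation) boundary, ignores min_chunk_chars and the per-chunk prosody summaries, and uses a hypothetical `Word` record plus placeholder default thresholds.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Word:  # hypothetical stand-in for the ASR word-timing record
    word: str
    start_ms: int
    end_ms: int


def chunk_words(words: Sequence[Word], max_chunk_chars: int = 120,
                vad_min_silence_ms: int = 400) -> List[str]:
    """Split a word stream on hard (silence) and soft (punctuation) boundaries."""
    chunks: List[str] = []
    current: List[str] = []
    for i, w in enumerate(words):
        current.append(w.word)
        text = " ".join(current)
        is_last = i == len(words) - 1
        gap_ms = 0 if is_last else words[i + 1].start_ms - w.end_ms
        hard = gap_ms >= vad_min_silence_ms
        soft = w.word.endswith((".", "?", "!")) and len(text) > max_chunk_chars
        if hard or soft or is_last:
            chunks.append(text)
            current = []
    return chunks
```

A long pause always closes the chunk, while sentence punctuation closes it only once the running text has grown past the character budget.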
System prompt fixes JSON-only output, the seven NMM keys, and the yes/no vs wh-question vs negation NMM rules. Few-shots cover wh-Q, yes/no Q, negation, emphasis, neutral declarative, and a role-shift quote. PROMPT_VERSION participates in the interpreter stage cache fingerprint so prompt edits invalidate just that stage.
plan_chunks() calls the configured LLMProvider once per chunk, strips ```json fences, retries once on parse failure, and falls back to a minimal AslPlanSegment if the model still returns junk. Sign tokens are normalised (UPPERCASE ASCII alnum/underscore), NMM intents clamped to [0, 1], role shifts validated. Gloss filtering against the pose library is deferred to Phase 5.
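The fence-strip, retry-once, then-fallback flow might be sketched like this. The response keys (`signs`, `nmm`), the returned dict shape, and the `call_llm` callable are hypothetical; the real planner works with LLMProvider and AslPlanSegment, which are not reproduced here.

```python
import json
import re
from typing import Any, Callable, Dict

FENCE = "`" * 3  # built programmatically so this example stays fence-safe


def strip_fences(raw: str) -> str:
    """Remove a leading/trailing Markdown code fence (optionally tagged json)."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text[len(FENCE):]
        if text.lower().startswith("json"):
            text = text[4:]
        if text.endswith(FENCE):
            text = text[:-len(FENCE)]
    return text.strip()


def normalise_sign(token: str) -> str:
    # Keep only uppercase ASCII letters, digits, and underscores.
    return re.sub(r"[^A-Z0-9_]", "", token.upper())


def clamp01(value: Any) -> float:
    return max(0.0, min(1.0, float(value)))


def plan_chunk(text: str, call_llm: Callable[[str], str], attempts: int = 2) -> Dict[str, Any]:
    """Call the model, retry once on bad output, then fall back to a minimal plan."""
    for _ in range(attempts):
        try:
            data = json.loads(strip_fences(call_llm(text)))
            signs = [normalise_sign(s) for s in data["signs"]]
            nmm = {k: clamp01(v) for k, v in data.get("nmm", {}).items()}
            return {"signs": [s for s in signs if s], "nmm": nmm, "fallback": False}
        except (json.JSONDecodeError, KeyError, TypeError, ValueError, AttributeError):
            continue
    return {"signs": [], "nmm": {}, "fallback": True}
```

Treating schema errors the same as JSON errors means one retry policy covers both truncated output and well-formed JSON with the wrong shape.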
Two cacheable stages around the Phase 3 domain modules. The semantic chunk fingerprint covers max/min chunk chars and the VAD silence threshold; the interpreter fingerprint folds in PROMPT_VERSION, provider+model, and chunk text — so re-runs are JSON reads and prompt iteration invalidates exactly the interpreter cache. Pipeline.run() still raises until Phase 5 ships motion synthesis.
Chunker: respects max_chunk_chars on long pause-less runs, splits on a hard silence boundary. Planner: one provider call per chunk, retry on malformed JSON, fallback when both attempts fail, NMM clamped to [0, 1], code-fence stripping. InterpreterPlanStage: fingerprint folds in PROMPT_VERSION and chunk text; second .run() with the same input hits the disk cache and skips the provider entirely.
The per-gloss WLASL stitching path is structurally Signed English with NMM dressing, not ASL. The unit of retrieval moves from a gloss keyframe to a continuous Deaf-signed clip — OpenASL as primary index, ASL Citizen as a lexical secondary, WLASL kept only as the last-resort stitching fallback. Each output segment is tagged with a fidelity tier so the consumer can render a badge in dev mode. Adds docs/plan/phase-4-corpus-retrieval.md as the new spec; the old phase-4-pose-library.md is retained with a superseded banner because its content still describes the fallback path correctly. Phase 5 is rewritten end-to-end for the tiered retrieval + retrieved-face NMM behavior.
The previous wording allowed per-gloss WLASL stitching to satisfy the retrieval invariant, which is exactly the loophole that produced Signed English at Phase 5. Default tier is now a continuous Deaf-signed clip retrieved at phrase level; WLASL stitching is permitted only as the tagged fallback. Adds a retrieval config section, OpenASL/ASL Citizen/WLASL tier descriptions, and updates the v5.1 schema sketch + flow diagram.
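The tier ordering described here could be expressed as a small decision function. This is only a sketch: `choose_tier`, its parameters, and the threshold values are hypothetical, and the tier strings follow the fidelity tiers named in the schema change.

```python
from typing import Optional


def choose_tier(phrase_sim: Optional[float], lexical_sim: Optional[float],
                has_stitch_poses: bool,
                phrase_threshold: float = 0.6,
                lexical_threshold: float = 0.7) -> str:
    """Pick the highest-fidelity tier whose similarity clears its threshold."""
    if phrase_sim is not None and phrase_sim >= phrase_threshold:
        return "retrieval"  # continuous Deaf-signed clip at phrase level
    if lexical_sim is not None and lexical_sim >= lexical_threshold:
        return "lexical"    # lexical secondary corpus
    if has_stitch_poses:
        return "stitched"   # per-gloss stitching, tagged as fallback
    return "degraded"       # nothing usable was found
```

Returning the tier as a plain string keeps it easy to carry on each segment so a consumer can render a badge in dev mode.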
AslPlanSegment gains four optional fields the Phase 5 motion-synth
stage will populate: query_text (phrase-level retrieval query, can be
emitted by the interpreter brain or composed at synth time),
retrieved_clip_id, retrieval_similarity, and a fidelity tier
("retrieval" | "lexical" | "stitched" | "degraded"). MotionSynthOutput
gains annotated_segments so the timeline stage can carry the tier into
the final AvatarRenderPlan.
All Phase-3-and-earlier code keeps working because the new fields are
optional with safe defaults. Test coverage: the bootstrap roundtrip
test now asserts the v5.1 string and the retrieval fields, plus a new
back-compat test confirms a pre-Phase-5 segment still parses.
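The back-compat property (old payloads still parse because the new fields default safely) can be illustrated with a plain dataclass. The repository uses its own models, so `AslPlanSegmentSketch`, the `signs` field, and the `fidelity_tier` field name are hypothetical stand-ins.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class AslPlanSegmentSketch:
    signs: List[str]
    # Fields added for retrieval; all optional so older payloads still load.
    query_text: Optional[str] = None
    retrieved_clip_id: Optional[str] = None
    retrieval_similarity: Optional[float] = None
    fidelity_tier: Optional[str] = None  # "retrieval" | "lexical" | "stitched" | "degraded"

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "AslPlanSegmentSketch":
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


old_payload = {"signs": ["HELLO"]}
new_payload = {"signs": ["HELLO"], "retrieved_clip_id": "clip-1",
               "retrieval_similarity": 0.82, "fidelity_tier": "retrieval"}
old_segment = AslPlanSegmentSketch.from_dict(old_payload)
new_segment = AslPlanSegmentSketch.from_dict(new_payload)
```

Here `old_segment` loads with every retrieval field left at `None`, which is the shape of check the new back-compat test is described as making.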
* requirements.txt — uncomment mediapipe/opencv and add sentence-
transformers, faiss-cpu, scipy, tqdm. These are needed by the corpus
fetch + index build scripts and the runtime retrieval API.
* .gitignore — exclude corpus video bytes, per-clip pose JSON, and the
WLASL pose-library output. Manifests + the FAISS index file remain
tracked so a fresh clone gets the index for free.
* src/core/config.py — RetrievalSettings (embedding model, phrase /
lexical similarity thresholds, max clip duration, primary/secondary
corpus names) and a corpus_root path entry.
* src/core/paths.py — corpus_{clip,pose}_dir, corpus_manifest_path,
corpus_index_path, corpus_embeddings_path helpers so every Phase 4
module agrees on layout.
* src/core/logging.py — setup_script_logging(name) for the long-running
offline scripts: each invocation writes a timestamped log file under
logs/ so the user can tail it during a multi-hour run without one
script clobbering another's output.
* config.yaml — exposes the new retrieval section with documented
defaults.
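A per-invocation log helper along the lines of setup_script_logging(name) might look like the sketch below. Only the function name and the timestamped file under logs/ come from the commit message; the `log_dir` and `level` parameters, the format string, and the returned path are assumptions.

```python
import logging
from datetime import datetime
from pathlib import Path


def setup_script_logging(name: str, log_dir: Path = Path("logs"),
                         level: int = logging.INFO) -> Path:
    """Attach a timestamped file handler plus a console handler to the root logger."""
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    log_path = log_dir / f"{name}-{stamp}.log"

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setFormatter(fmt)  # the file keeps everything, including DEBUG
    console = logging.StreamHandler()
    console.setFormatter(fmt)
    console.setLevel(level)         # the console shows only `level` and above

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(file_handler)
    root.addHandler(console)
    return log_path
```

Because the timestamp is part of the filename, two scripts started at different times write to different files, and the returned path can be printed so it can be tailed during a long run.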
vrm_retarget.py: direct-mapping landmarks → VRM humanoid bone quaternions. Mediapipe pose_world_landmarks (Y-down, hip-centered) get flipped to VRM's Y-up frame, then each bone's quaternion is the shortest-arc rotation aligning its rest-pose direction with the relevant landmark-pair vector. Includes the full VRM humanoid bone list (core + 30 finger bones) so the dict is gap-free; missing inputs fall through to identity. Finger joints get segment-to-segment alignments rather than full IK, which is adequate for the prototype's visible hand articulation; a library-based retargeter is a v1.1 task.
pose_extractor.py: extract_pose_stream(clip_path, target_fps=30) opens a clip via OpenCV, samples at the target fps, runs Mediapipe Holistic per frame, and yields paired MotionFrame + NmmFrame tracks. NMM frames carry coarse geometric approximations of ARKit blendshapes (jawOpen, brow direction, mouth width, eye openness) so Phase 5 can keep the retrieved signer's natural facial expressions when present. Heavy deps imported lazily; rest_motion_frame helper for the idle pose between segments.
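The shortest-arc rotation that vrm_retarget.py is described as using can be sketched with plain tuples. This is the standard construction (cross product for the axis part, 1 + dot for the scalar part), not code lifted from the module; the function names and the (x, y, z, w) component order are assumptions.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # (x, y, z, w), assumed ordering


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v: Vec3) -> Vec3:
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-9:
        raise ValueError("cannot normalise a zero-length vector")
    return (v[0] / n, v[1] / n, v[2] / n)


def shortest_arc(src: Vec3, dst: Vec3) -> Quat:
    """Unit quaternion rotating direction `src` onto direction `dst`."""
    a, b = _normalize(src), _normalize(dst)
    dot = sum(x * y for x, y in zip(a, b))
    if dot > 1.0 - 1e-9:            # already aligned: identity
        return (0.0, 0.0, 0.0, 1.0)
    if dot < -1.0 + 1e-9:           # opposite: 180 degrees about any perpendicular axis
        helper = (1.0, 0.0, 0.0) if abs(a[0]) < 0.9 else (0.0, 1.0, 0.0)
        axis = _normalize(_cross(a, helper))
        return (axis[0], axis[1], axis[2], 0.0)
    cx, cy, cz = _cross(a, b)
    w = 1.0 + dot
    n = math.sqrt(cx * cx + cy * cy + cz * cz + w * w)
    return (cx / n, cy / n, cz / n, w / n)


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Apply unit quaternion `q` to vector `v`."""
    u = (q[0], q[1], q[2])
    t = tuple(2.0 * c for c in _cross(u, v))
    ut = _cross(u, t)
    return (v[0] + q[3] * t[0] + ut[0],
            v[1] + q[3] * t[1] + ut[1],
            v[2] + q[3] * t[2] + ut[2])
```

The two early returns correspond to the identity-for-equal-vectors and 180-degree-flip cases that the retarget tests are described as covering.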
…loader
retrieval.py: RetrievalIndex(name=...) lazily loads a FAISS index and the matching sentence-transformer (configured in settings.retrieval.embedding_model). query(text, k) returns ranked RetrievalHits with cosine similarity normalised into [0, 1]; the threshold check is left to the caller (MotionSynthStage). load_poses reads the per-clip pose JSON written by build_corpus_index. An index_signature property gives Phase 5 a cheap cache key. The from_memory classmethod is the test seam: tests/test_retrieval.py exercises the full code path without touching FAISS or the model.
pose_library.py: file-backed PoseLibrary keyed by uppercase gloss for the WLASL fallback tier. Lazy: construction touches no disk, only has/get/glosses do; get() is cached after first read. Case-insensitive lookup so callers can pass either "HELLO" or "hello".
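An in-memory index in the spirit of the from_memory test seam might look like the sketch below. `InMemoryIndex`, `Hit`, and `toy_embed` are hypothetical, the bag-of-words embedder merely stands in for the sentence-transformer, and clamping the cosine into [0, 1] is one possible normalisation rather than the one the module is known to use.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Hit:
    clip_id: str
    caption: str
    similarity: float


def _unit(vec: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else [0.0 for _ in vec]


def toy_embed(text: str) -> List[float]:
    # Toy bag-of-words embedder over a fixed vocabulary (illustration only).
    vocab = ["where", "is", "the", "library", "i", "like", "coffee"]
    words = text.lower().replace("?", "").split()
    return [float(words.count(w)) for w in vocab]


class InMemoryIndex:
    def __init__(self, clip_ids: Sequence[str], captions: Sequence[str],
                 embed: Callable[[str], Sequence[float]]) -> None:
        self._clip_ids = list(clip_ids)
        self._captions = list(captions)
        self._embed = embed
        self._vectors = [_unit(embed(c)) for c in self._captions]

    def query(self, text: str, k: int = 3) -> List[Hit]:
        if not text.strip():  # empty or whitespace-only query: no hits
            return []
        q = _unit(self._embed(text))
        hits = []
        for clip_id, caption, vec in zip(self._clip_ids, self._captions, self._vectors):
            cos = sum(a * b for a, b in zip(q, vec))
            hits.append(Hit(clip_id, caption, max(0.0, min(1.0, cos))))
        hits.sort(key=lambda h: h.similarity, reverse=True)
        return hits[:k]
```

Leaving the threshold check out of `query` mirrors the split described above, where the caller decides whether a hit is good enough.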
… manifest
Reads an upstream OpenASL release manifest (TSV/CSV with clip_id, youtube_id, start_seconds, end_seconds, caption_en, optional signer_id) via --source PATH or --source URL. For each row, downloads the source YouTube video once (cached per youtube_id under assets/corpus/openasl/_sources/), then ffmpeg-trims [start, end] to assets/corpus/openasl/<clip_id>.mp4. Probes the trim for actual duration and writes/merges assets/corpus/openasl_manifest.json.
Resumable (--no-resume to force re-fetch), parallel (--workers K), manifest flushed every N rows so a Ctrl-C doesn't lose progress. Clips exceeding settings.retrieval.max_clip_duration_ms are skipped at fetch time so we don't waste disk on full lectures.
Logging: each invocation writes logs/fetch_openasl-<YYYYMMDD-HHMMSS>.log via setup_script_logging. Console defaults to INFO; pass --log-level DEBUG for per-row detail. The log path is printed at start and end so the user can tail -F it during a multi-hour run.
Two-phase offline build over a corpus manifest:
1. Embedding + FAISS index. Loads the sentence-transformer named in settings.retrieval.embedding_model, encodes every caption with batch=128 and L2 normalization (so inner-product = cosine), and writes assets/corpus/<name>.faiss + <name>_embeddings.npy.
2. Pose extraction. For each clip, runs extract_pose_stream in a child process (ProcessPoolExecutor) so mediapipe state stays isolated and a single crash doesn't poison the whole batch. Output: assets/corpus/<name>_poses/<clip_id>.json, the shape the runtime RetrievalIndex.load_poses() consumes directly.
CLI flags --skip-poses / --skip-index let the user re-do just one half (e.g. after changing the embedding model). --limit N for smoke runs. --workers K (default 2) for pose extraction parallelism. Progress lines emit every 25 clips with rate + ETA so the user can tell whether a multi-hour run is on track. Logging mirrors fetch_openasl: timestamped log file under logs/.
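The reason L2 normalisation lets an inner-product index return cosine scores follows directly from the definition of cosine similarity. The symbols below are introduced here for illustration, with a and b standing for any two caption embeddings:

```latex
\cos(\mathbf{a},\mathbf{b})
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert_2\,\lVert\mathbf{b}\rVert_2},
\qquad
\hat{\mathbf{a}} = \frac{\mathbf{a}}{\lVert\mathbf{a}\rVert_2},\;
\hat{\mathbf{b}} = \frac{\mathbf{b}}{\lVert\mathbf{b}\rVert_2}
\;\Longrightarrow\;
\hat{\mathbf{a}}\cdot\hat{\mathbf{b}}
  = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert_2\,\lVert\mathbf{b}\rVert_2}
  = \cos(\mathbf{a},\mathbf{b}).
```

So once every stored vector and every query vector has unit length, ranking by inner product is the same as ranking by cosine similarity.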
…gate
Loads tests/fixtures/retrieval_eval.json (10 hand-curated English chunks across wh-Q, yes/no Q, negation, topic-comment, classifier, role shift, time anchor, numeric, and two neutral declaratives), queries the OpenASL index for top-k hits per chunk, and prints each hit's caption + clip id to console plus a side-by-side markdown table to logs/retrieval_eval-<YYYYMMDD-HHMMSS>.md.
This is the human-in-the-loop gate documented under Verification in docs/plan/phase-4-corpus-retrieval.md: at least 7 of 10 chunks must have an on-target top-3 to proceed to Phase 5. Automated pass/fail would be wrong here: ASL semantic match is too subjective for a regex test. Defaults to k=3 to match the gate criterion; --k 5 for wider exploration.
Builds the per-gloss WLASL pose library used by Phase 5 as the last-resort fallback tier. Walks assets/word_manifest.json, picks the best clip per gloss honoring preferred_signer_ids from the manifest, runs extract_pose_stream per clip, and writes a PoseLibraryEntry JSON to assets/pose_library/<GLOSS>.json. Defaults to --limit 500 per the corpus-retrieval pivot — the full 2000-entry build is no longer the primary asset. Use --all to build everything (several hours on CPU), --gloss HELLO to debug a single sign, --force to re-extract over an existing JSON. Logging is the same setup_script_logging shape used by the corpus scripts; progress lines every 25 clips with clips/s rate.
vrm_retarget: quaternion unit-norm + identity-for-equal-vectors + 180-degree-flip + synthetic T-pose landmarks producing near-identity arm quats + bent-arm producing a non-identity lower-arm quat. The tests use plain (x, y, z) tuples / tiny objects with .x/.y/.z so mediapipe isn't imported.
pose_library: known/unknown gloss lookup, case-insensitive lookup, glosses-property listing, get()-caching, and a lazy-no-disk-touch-at-init test that constructs the library then adds a file and confirms has() picks it up.
retrieval: RetrievalIndex.from_memory test seam used to bypass FAISS / sentence-transformers. Exact-caption query returns top-1 at sim ~1.0; lexical-overlap query soft-matches the closest caption; empty / whitespace query returns []; load_poses reads disk lazily and raises FileNotFoundError on a missing clip id; index_signature changes when the manifest grows.
Total: 16 new tests, all green alongside the existing 43.
…roach
Rewrite the business docs from two competing layers (v1 word-level-learner plan + v2 feasibility study) into one coherent plan aligned with the committed approach: retrieval-augmented, grammar-aware, phrase-level ASL; platform-pays; no word-level output. Refresh all market data to May 2026.
- Promote the six numbered docs to the canonical plan; reframe feasibility-study/ as the technical & feasibility appendix.
- Refresh regulatory drivers: ADA Title II deadline extended to 2027/2028; EAA live since June 2025; 2025 litigation rebound (~3,900, +24%).
- Add Sorenson (Hand Talk + OmniBridge acquisition, April 2026 avatar POCs) as the now-live incumbent threat across competitive sections.
- Re-derive TAM/SAM/SOM; fold induced-demand model into market analysis.
- Align F1 Stage 3 to phrase-level-retrieval-first; note Phases 1-3 shipped.
- Map the product roadmap to actual pipeline phases 4-7.
- Scrub dead "v1 plan" references; cite real corpora (OpenASL, ASL Citizen).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
This PR advances the pipeline from a “skeleton” toward an operational Phase 2–4 implementation by adding concrete cached stages for the audio backbone and interpreter planning, introducing Phase 4 corpus retrieval + pose extraction infrastructure (OpenASL ingestion, FAISS index build, pose extraction), and updating the v5 schema/documents accordingly.
Changes:
- Add cached pipeline stages for audio ingest/analyze and semantic chunking/interpreter planning, plus a run_audio_only() pipeline helper.
- Introduce Phase 4 corpus retrieval layer (RetrievalIndex runtime API, Mediapipe pose extraction, offline scripts, fixtures) and corresponding settings/config/docs.
- Bump schema to v5.1 and add retrieval metadata fields to plan segments; add extensive unit tests for audio/interpreter/retrieval/pose library.
Reviewed changes
Copilot reviewed 58 out of 60 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/test_vrm_retarget.py | Adds unit tests for VRM retarget quaternion math and landmark→bone mapping behavior. |
| tests/test_retrieval.py | Adds RetrievalIndex tests using in-memory FAISS/embedder doubles and on-disk pose JSON fixtures. |
| tests/test_pose_library.py | Adds tests for the lazy, cached WLASL pose-library loader. |
| tests/test_interpreter_planner.py | Adds tests for chunker + interpreter planner parsing/retry/fallback behavior and stage cache/fingerprint stability. |
| tests/test_avatar_pipeline_bootstrap.py | Updates schema round-trip test to v5.1 and adds back-compat coverage for new segment fields. |
| tests/test_audio_analyzer.py | Adds tests for audio ingest/analyze caching and emotion/prosody/ASR behaviors (with optional heavy deps). |
| tests/fixtures/retrieval_eval.json | Adds a hand-curated retrieval evaluation fixture for the Phase 4 quality gate script. |
| src/pipeline/stages/semantic_chunk.py | Introduces Stage 3 wrapper around chunker with disk caching + fingerprinting. |
| src/pipeline/stages/interpreter_plan.py | Introduces Stage 4 wrapper around planner with disk caching + fingerprinting. |
| src/pipeline/stages/audio_ingest.py | Introduces Stage 1 (download + extract WAV) with disk caching + fingerprinting. |
| src/pipeline/stages/audio_analyze.py | Introduces Stage 2 (ASR+prosody+emotion fusion) with disk caching + fingerprinting. |
| `src/pipeline/stages/__init__.py` | Exposes concrete stage classes via package exports. |
| src/pipeline/pipeline_avatar.py | Wires Phase 2–3 stages and adds run_audio_only(); full run() still raises NotImplementedError until Phase 5. |
| src/pipeline/models.py | Bumps schema_version to 5.1; adds retrieval-related metadata fields to AslPlanSegment and annotated_segments to MotionSynthOutput. |
| src/interpreter/prompt.py | Adds interpreter persona system prompt + few-shot examples and message builders; introduces PROMPT_VERSION. |
| src/interpreter/planner.py | Adds robust planner implementation (JSON extraction, retry, fallback, gloss/NMM normalization). |
| src/interpreter/chunker.py | Adds chunker implementation producing InterpreterChunks with emotion/prosody summaries and boundary logic. |
| `src/interpreter/__init__.py` | Adds interpreter package marker docstring. |
| src/core/paths.py | Adds corpus path helpers for Phase 4 assets layout. |
| src/core/logging.py | Adds setup_script_logging() for per-script timestamped logs; improves module-level docs. |
| src/core/config.py | Adds RetrievalSettings and paths.corpus_root; wires retrieval into Settings. |
| src/avatar/retrieval.py | Adds RetrievalIndex runtime API with lazy FAISS/embedder loading, query, and pose loading. |
| src/avatar/pose_library.py | Adds lazy WLASL pose-library loader used as last-resort fallback tier. |
| src/avatar/pose_extractor.py | Adds Mediapipe Holistic pose extraction producing MotionFrame + NmmFrame streams with rest-pose fallbacks. |
| `src/avatar/__init__.py` | Adds avatar package marker docstring. |
| src/audio/prosody.py | Adds librosa-based prosody extraction with lazy heavy imports. |
| src/audio/extractor.py | Adds ffmpeg-based audio extraction with caching and ffprobe duration probing. |
| src/audio/emotion.py | Adds LLM-based emotion classification over text+prosody summary with robust parsing/clamping. |
| src/audio/asr.py | Adds faster-whisper wrapper with lazy model load + thread-safe cache. |
| src/audio/analyzer.py | Adds parallel ASR/prosody execution and fuses results with emotion into AudioAnalysis. |
| scripts/retrieval_eval.py | Adds human-in-the-loop retrieval quality gate script that emits console logs + markdown report. |
| scripts/fetch_openasl.py | Adds OpenASL corpus fetch/trim/manifest builder with concurrency and resumability. |
| scripts/build_pose_library.py | Adds WLASL fallback pose-library builder producing per-gloss PoseLibraryEntry JSON. |
| scripts/build_corpus_index.py | Adds corpus embedding + FAISS index build and per-clip pose extraction (multi-process). |
| requirements.txt | Promotes Phase 2/4 dependencies (whisper/librosa/mediapipe/faiss/sentence-transformers, etc.) into requirements. |
| README.md | Updates phase status table to mark Phases 2–3 as done. |
| pytest.ini | Adds slow marker definition and guidance for skipping heavy/slow tests. |
| docs/plan/README.md | Updates phase roadmap and introduces Phase 4 corpus retrieval plan entry. |
| docs/plan/phase-4-pose-library.md | Archives the original Phase 4 pose-library plan and links to the new Phase 4 retrieval plan. |
| docs/plan/phase-4-corpus-retrieval.md | Adds the new Phase 4 corpus ingestion + phrase-level retrieval index plan document. |
| docs/plan/00-architecture.md | Updates architecture diagram/text for phrase-level retrieval and schema v5.1. |
| docs/architecture-overview.md | Updates canonical architecture overview to phrase-level retrieval and fidelity tagging. |
| config.yaml | Adds retrieval settings defaults (embedding model, thresholds, corpus names). |
| CLAUDE.md | Adds repository “working agreement” with invariants and updated phase status. |
| business/README.md | Rewrites business plan framing to the committed phrase-level retrieval approach. |
| business/feasibility-study/README.md | Reframes feasibility docs as a technical appendix supporting the unified plan. |
| business/feasibility-study/05-feasibility-verdict.md | Updates verdict language to align with the unified plan and incumbent threat framing. |
| business/feasibility-study/04-pricing-strategy-comparison.md | Updates terminology and clarifies hybrid model references. |
| business/feasibility-study/03-market-expansion.md | Updates induced-demand framing and references to the old consumer-learner thesis. |
| business/feasibility-study/02-competitive-tech-comparison.md | Updates competitor section (Sorenson/Hand Talk) and threat framing. |
| business/feasibility-study/01-technology-feasibility.md | Updates feasibility framing and Stage 3 (retrieval-first) architecture details. |
| business/04-value-proposition.md | Updates value proposition to platform-pays + phrase-level retrieval approach. |
| business/03-competitive-landscape.md | Updates competitive landscape to the five-family taxonomy and corpus-as-moat framing. |
| business/01-executive-summary.md | Updates executive summary to reflect committed architecture and updated regulatory runway. |
| .gitignore | Ignores large Phase 4 corpus artifacts (clips/poses/embeddings) and pose_library outputs. |
Comment on lines +25 to +38
    def fingerprint(self, inp: SemanticChunkInput) -> str:
        s = self.settings
        analysis = inp.analysis
        return stable_hash([
            "semantic_chunk",
            analysis.duration_ms,
            len(analysis.asr_words),
            # Include first/last word to detect content drift cheaply.
            analysis.asr_words[0].word if analysis.asr_words else "",
            analysis.asr_words[-1].word if analysis.asr_words else "",
            s.interpreter.max_chunk_chars,
            s.interpreter.min_chunk_chars,
            s.audio.vad_min_silence_ms,
        ])
Comment on lines +40 to +50
    def process(self, inp: InterpreterPlanInput) -> InterpreterPlanOutput:
        segments, provider, model = plan_chunks(
            inp.chunks, settings=self.settings.interpreter
        )
        logger.info(
            "InterpreterPlanStage: %d segments via %s/%s",
            len(segments), provider, model,
        )
        return InterpreterPlanOutput(
            segments=segments, provider=provider, model=model
        )
Comment on lines +44 to +46
    def process(self, inp: AudioAnalyzeInput) -> AudioAnalyzeOutput:
        wav_path = PROJECT_ROOT / inp.audio_path
        analysis = analyze(wav_path, inp.duration_ms)
Comment on lines +20 to +34
    from pydantic import BaseModel

    from src.core.paths import PROJECT_ROOT
    from src.pipeline.models import MotionFrame, NmmFrame

    logger = logging.getLogger(__name__)


    class PoseLibraryEntry(BaseModel):
        gloss: str
        duration_ms: int
        fps: int = 30
        source_clip: str = ""
        keyframes: list[MotionFrame]
        nmm: list[NmmFrame] = []
Comment on lines +31 to +33
    # Full pull, 4 parallel workers, resume previous run
    python -m scripts.fetch_openasl --source path/to/openasl.tsv \
        --workers 4 --resume
Comment on lines +92 to +96
    path = Path(source).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Source manifest not found: {path}")
    logger.info("Reading source manifest %s", path)
    handle = path.open("r", encoding="utf-8")
Comment on lines +88 to +96
    @property
    def index_signature(self) -> str:
        """Cheap stable hash of the index file's mtime + manifest length."""
        manifest_n = len(self._load_manifest())
        try:
            mtime = int(self.index_path.stat().st_mtime)
        except FileNotFoundError:
            mtime = 0
        return f"{self.name}:{manifest_n}:{mtime}"