Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
108 changes: 105 additions & 3 deletions src/services/VideoProcessor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -424,6 +424,79 @@ export class VideoProcessor {
});
}

/**
* Validate file integrity by running ffmpeg error detection pass.
* Detects corruption, truncation, and bitstream errors that probe alone misses.
*/
private async validateFileIntegrity(filePath: string): Promise<{ corrupted: boolean; fatal: boolean; errors: string[] }> {
const { execFile } = await import('child_process');
const ffmpegPath = this.config.encoder?.ffmpeg_path || 'ffmpeg';

return new Promise((resolve) => {
const proc = execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], {
timeout: 120000,
}, (error, _stdout, stderr) => {
Comment on lines +435 to +438
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Timeout handling produces confusing message; 120s may be insufficient for large files.

When timeout occurs, error.code is 'ETIMEDOUT' (string), resulting in the message "ffmpeg integrity check exited with code ETIMEDOUT" — which isn't user-friendly.

Additionally, for 2+ hour videos mentioned in PR objectives, a 120-second validation timeout may be too short when reading every frame.

♻️ Suggested improvements
 return new Promise((resolve) => {
-  const proc = execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], {
-    timeout: 120000,
+  execFile(ffmpegPath, ['-v', 'error', '-i', filePath, '-f', 'null', '-'], {
+    timeout: 300000, // 5 minutes for large files
   }, (error, _stdout, stderr) => {
     // ... existing code ...
     
     // Non-zero exit with no recognized patterns — flag as potentially corrupt
-    if (error && !corrupted && !fatal && errors.length === 0) {
-      const exitCode = (error as any).code;
-      if (exitCode) {
-        errors.push(`ffmpeg integrity check exited with code ${exitCode}`);
+    if (error && !corrupted && !fatal && errors.length === 0) {
+      const exitCode = (error as any).code;
+      if (exitCode === 'ETIMEDOUT') {
+        errors.push('Integrity check timed out — file may be too large or unreadable');
+        corrupted = true;
+      } else if (exitCode) {
+        errors.push(`ffmpeg integrity check exited with code ${exitCode}`);
         corrupted = true;
       }
     }

Also applies to: 480-487

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/VideoProcessor.ts` around lines 435 - 438, The ffmpeg integrity
check spawned with execFile (using ffmpegPath and filePath) treats an ETIMEDOUT
as a numeric exit code string and uses a confusing message and a hardcoded
120000ms timeout that may be too short for large/long videos; update the Promise
callback in VideoProcessor to (1) detect timeout explicitly (error && error.code
=== 'ETIMEDOUT' or error.killed) and produce a clear message like "ffmpeg
integrity check timed out" (include filePath for context), (2) make the timeout
configurable (e.g., use a parameter or a constant like ffmpegValidateTimeout) or
derive it from file size/duration rather than a fixed 120000, and (3) ensure
non-timeout ffmpeg failures still surface their numeric exit code or stderr in
the error message so execFile's callback returns an informative error for both
timeout and other failures.

const output = stderr || '';
const errors: string[] = [];
let fatal = false;
let corrupted = false;

// Fatal: file is unplayable
if (output.includes('moov atom not found')) {
errors.push('moov atom not found (truncated MP4 — likely incomplete download)');
fatal = true;
}
if (output.includes('Invalid data found when processing input') && !output.includes('NAL unit')) {
errors.push('Invalid data found at container level');
fatal = true;
}
if (output.includes('End of file') || output.includes('error reading header')) {
errors.push('File is truncated (unexpected end of file)');
corrupted = true;
}

// Recoverable: corruption that error-tolerant flags can handle
if (output.includes('Invalid NAL unit')) {
errors.push('Invalid NAL unit sizes (H.264 bitstream corruption)');
corrupted = true;
}
if (output.includes('partial file')) {
errors.push('Partial/truncated file detected');
corrupted = true;
}
if (output.includes('error while decoding') || output.includes('decode_slice_header')) {
errors.push('Frame decoding errors detected');
corrupted = true;
}
if (output.includes('non-existing PPS') || output.includes('non-existing SPS')) {
errors.push('H.264 parameter sets missing or corrupted');
corrupted = true;
}
if (output.includes('missing picture in access unit')) {
errors.push('Missing pictures in access units');
corrupted = true;
}

// Non-zero exit with no recognized patterns — flag as potentially corrupt
if (error && !corrupted && !fatal && errors.length === 0) {
const exitCode = (error as any).code;
if (exitCode) {
errors.push(`ffmpeg integrity check exited with code ${exitCode}`);
corrupted = true;
}
}

if (errors.length > 0) {
logger.warn(`🔍 File integrity check: ${errors.join('; ')}`);
} else {
logger.info('✅ File integrity check passed');
}

resolve({ corrupted: corrupted || fatal, fatal, errors });
});
});
}

/**
* Get bit depth from pixel format string
*/
Expand Down Expand Up @@ -697,14 +770,22 @@ export class VideoProcessor {
}

// 11. 🚨 CYRILLIC/UNICODE METADATA: Handle encoding issues
const hasUnicodeMetadata = probe.rawMetadata?.format?.tags?.title &&
const hasUnicodeMetadata = probe.rawMetadata?.format?.tags?.title &&
/[^\x00-\x7F]/.test(probe.rawMetadata.format.tags.title);
if (hasUnicodeMetadata) {
// Strip problematic metadata that might cause encoding failures
strategy.extraOptions.push('-map_metadata', '-1');
reasons.push('unicode metadata detected - strip to prevent encoding issues');
}

// 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags
const hasCorruption = probe.issues.some(i => i.type === 'file_corruption');
if (hasCorruption) {
strategy.inputOptions.push('-err_detect', 'ignore_err');
strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts');
reasons.push('corruption detected - using error-tolerant decoding');
}
Comment on lines +781 to +787
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Input options pushed as separate elements — works but fragile.

Pushing -err_detect and ignore_err as separate array elements works because the worker iterates and FFmpeg will parse them correctly in sequence. However, this is inconsistent with typical fluent-ffmpeg patterns and could break if iteration order changes.

♻️ More robust approach
 const hasCorruption = probe.issues.some(i => i.type === 'file_corruption');
 if (hasCorruption) {
-  strategy.inputOptions.push('-err_detect', 'ignore_err');
-  strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts');
+  strategy.inputOptions.push('-err_detect ignore_err');
+  strategy.inputOptions.push('-fflags +discardcorrupt+genpts');
   reasons.push('corruption detected - using error-tolerant decoding');
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags
const hasCorruption = probe.issues.some(i => i.type === 'file_corruption');
if (hasCorruption) {
strategy.inputOptions.push('-err_detect', 'ignore_err');
strategy.inputOptions.push('-fflags', '+discardcorrupt+genpts');
reasons.push('corruption detected - using error-tolerant decoding');
}
// 12. 🩹 FILE CORRUPTION: Add error-tolerant decoding flags
const hasCorruption = probe.issues.some(i => i.type === 'file_corruption');
if (hasCorruption) {
strategy.inputOptions.push('-err_detect ignore_err');
strategy.inputOptions.push('-fflags +discardcorrupt+genpts');
reasons.push('corruption detected - using error-tolerant decoding');
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/VideoProcessor.ts` around lines 781 - 787, The current
hasCorruption branch pushes FFmpeg flags as separate array elements which is
fragile; update the block that checks probe.issues (the hasCorruption variable)
to push each option as a single combined string into strategy.inputOptions
(e.g., combine flag and value into one element for the err_detect option and
keep -fflags with its combined value) so that the options are atomic when later
iterated; reference the hasCorruption check and the strategy.inputOptions array
when making this change.


strategy.reason = reasons.length > 0 ? reasons.join(', ') : 'standard processing';

return strategy;
Expand Down Expand Up @@ -739,13 +820,34 @@ export class VideoProcessor {
logger.info(`📥 Downloading source video for job ${jobId}`);
await this.downloadVideo(job.input.uri, sourceFile);

// 🔍 NEW: Probe input file to detect format and compatibility issues
// 🔍 Validate file integrity before probing/encoding
logger.info('🔍 Checking file integrity...');
const integrity = await this.validateFileIntegrity(sourceFile);
if (integrity.fatal) {
throw new Error(
`File is fatally corrupted: ${integrity.errors.join('; ')}. ` +
`Cannot process this file. Source: ${job.input.uri}`
);
}

// 🔍 Probe input file to detect format and compatibility issues
logger.info(`🔍 Probing input file for compatibility...`);
let probeResult: FileProbeResult | null = null;
let encodingStrategy: EncodingStrategy | null = null;

try {
probeResult = await this.probeInputFile(sourceFile);

// Merge integrity results into probe issues
if (integrity.corrupted && probeResult) {
probeResult.issues.push({
type: 'file_corruption',
severity: 'warning',
message: `File integrity issues: ${integrity.errors.join('; ')}`,
suggestion: 'Will use error-tolerant decoding flags',
});
}

encodingStrategy = this.determineEncodingStrategy(probeResult);

logger.info(`✅ Probe complete - Strategy: ${encodingStrategy.reason}`);
Expand Down
23 changes: 23 additions & 0 deletions src/workers/VideoEncodingWorker.ts
Original file line number Diff line number Diff line change
Expand Up @@ -263,6 +263,14 @@ async function encodeProfile(task: EncodingTask): Promise<void> {
const segmentFiles = files.filter(f => f.endsWith('.ts'));
const stats = await fs.stat(outputPath);

// Validate output integrity
if (segmentFiles.length === 0) {
throw new Error('Encoding produced no output segments — possible input corruption or encoding failure');
}
if (stats.size === 0) {
throw new Error('Playlist file is empty (0 bytes) — encoding likely failed silently');
}

const result = {
profile: profile.name,
path: outputPath,
Expand Down Expand Up @@ -315,6 +323,21 @@ async function encodeProfile(task: EncodingTask): Promise<void> {
}
}

// Corruption-related error detection
if (errorMsg.includes('Invalid data found')) {
errorDetails = 'Input file contains corrupted data. The file may be incomplete or damaged.';
} else if (errorMsg.includes('moov atom not found')) {
errorDetails = 'MP4 container is truncated (moov atom missing). The download was likely incomplete.';
} else if (errorMsg.includes('error while decoding') || errorMsg.includes('decode_slice_header')) {
errorDetails = 'Video frames are corrupted. Some frames could not be decoded.';
} else if (errorMsg.includes('non-existing PPS') || errorMsg.includes('non-existing SPS')) {
errorDetails = 'H.264 parameter sets are missing or corrupted.';
}
if (errorMsg.includes('183') || errorMsg.includes('exit code 183')) {
errorDetails = (errorDetails ? errorDetails + ' ' : '') +
'FFmpeg exit code 183 indicates input file corruption. Try re-downloading the source.';
}
Comment on lines +336 to +339
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Exit code 183 detection pattern is overly broad — risk of false positives.

errorMsg.includes('183') will match any string containing "183", including frame numbers (frame=1830), timestamps (183.5), or other numeric values. This could incorrectly append the corruption message to unrelated errors.

🔧 Suggested fix: use more specific pattern
-if (errorMsg.includes('183') || errorMsg.includes('exit code 183')) {
+if (/\b(exit\s+)?(code\s+)?183\b/i.test(errorMsg) || errorMsg.includes('exited with code 183')) {
   errorDetails = (errorDetails ? errorDetails + ' ' : '') +
     'FFmpeg exit code 183 indicates input file corruption. Try re-downloading the source.';
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/workers/VideoEncodingWorker.ts` around lines 336 - 339, The current check
uses errorMsg.includes('183') which is too broad and yields false positives;
update the condition in the FFmpeg error handling (the block referencing
errorMsg and errorDetails where you append "FFmpeg exit code 183...") to match
only explicit exit/code phrases using a stricter pattern (e.g., test for
whole-word sequences like "exit code 183" or "code 183" via a regex such as a
word-boundary match) so frame numbers or timestamps like "1830" won't trigger
the corruption message; keep the existing append logic to errorDetails but only
run it when the refined match against errorMsg succeeds.


// Send error to main thread
sendMessage({
type: 'error',
Expand Down