2 changes: 1 addition & 1 deletion pyproject.toml
@@ -22,7 +22,7 @@ dependencies = [
     "demucs",
     "ffmpeg-python",
     "matplotlib",
-    "musicbrainzngs",
+    "eyed3",
     "python-Levenshtein",
     "pretty-midi",
     "unidecode",
48 changes: 37 additions & 11 deletions src/UltraSinger.py
@@ -1,6 +1,7 @@
 """UltraSinger uses AI to automatically create UltraStar song files"""

 import copy
+import eyed3
 import getopt
 import os
 import sys
@@ -64,7 +65,6 @@
 from modules.common_print import print_support, print_help, print_version
 from modules.os_helper import check_file_exists, get_unused_song_output_dir
 from modules.plot import create_plots
-from modules.musicbrainz_client import search_musicbrainz
 from modules.sheet import create_sheet
 from modules.ProcessData import ProcessData, ProcessDataPaths, MediaInfo
 from modules.DeviceDetection.device_detection import check_gpu_support
@@ -646,6 +646,8 @@ def transcribe_audio(cache_folder_path: str, processing_audio_path: str) -> Tran

 def infos_from_audio_video_input_file() -> tuple[str, str, str, MediaInfo]:
     """Infos from audio/video input file"""
+    import eyed3
+
     basename = os.path.basename(settings.input_file_path)
     basename_without_ext = os.path.splitext(basename)[0]

@@ -655,8 +657,32 @@ def infos_from_audio_video_input_file() -> tuple[str, str, str, MediaInfo]:
     else:
         title = basename_without_ext

-    song_info = search_musicbrainz(title, artist)
-    basename_without_ext = f"{song_info.artist} - {song_info.title}"
+    # Read ID3 tags from file
+    file_artist, file_title, year, genres, cover_image_data = None, None, None, [], None
+
+    try:
+        audio_file = eyed3.load(settings.input_file_path)
+        if audio_file is not None and audio_file.tag is not None:
+            file_artist = audio_file.tag.artist or artist or "Unknown Artist"
+            file_title = audio_file.tag.title or title
+            if audio_file.tag.best_release_date:
+                year = audio_file.tag.best_release_date.year
+            if audio_file.tag.genre:
+                genres = audio_file.tag.genre.name
+            if audio_file.tag.images:
+                cover_image_data = audio_file.tag.images[0].image_data
+    except Exception:
+        file_artist = artist or "Unknown Artist"
+        file_title = title
Comment on lines +661 to +676

⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check which audio formats are referenced in the codebase
echo "=== Searching for audio format references ==="
rg -i --type=py -C2 '(mp3|flac|ogg|m4a|wav|aac)' -g '!test*' | head -30

echo ""
echo "=== Checking eyed3 format support documentation ==="
# Search for format-related checks or conversions
rg -i --type=py 'is_video_file|audio.*format|file.*extension' -C3

Repository: rakuri255/UltraSinger

Length of output: 9299


🏁 Script executed:

# First, let's look at the broader context around lines 661-676 in UltraSinger.py
sed -n '600,700p' src/UltraSinger.py | cat -n

Repository: rakuri255/UltraSinger

Length of output: 5295


🏁 Script executed:

# Search for where input_file_path is processed before the metadata extraction
rg -n 'input_file_path|eyed3' src/UltraSinger.py | head -40

Repository: rakuri255/UltraSinger

Length of output: 1163


🏁 Script executed:

# Check if there's format detection or conversion before metadata extraction
rg -B10 'eyed3.load' src/UltraSinger.py

Repository: rakuri255/UltraSinger

Length of output: 434


🏁 Script executed:

# Check if there's any eyed3 configuration or documentation in the codebase
rg -i 'eyed3|id3' src/ --type=py -B2 -A2

Repository: rakuri255/UltraSinger

Length of output: 1406


🏁 Script executed:

# Check requirements/dependencies to see eyed3 version
fd -i requirements '\.(txt|in|toml|yaml|yml)$' -exec cat {} \;

Repository: rakuri255/UltraSinger

Length of output: 325


🌐 Web query:

eyed3 library supported audio formats

💡 Result:

The eyeD3 library is primarily designed for working with MP3 files and ID3 metadata [1][2]. According to the official documentation, it explicitly supports the following file types:

- MP3 audio files (via eyed3.mp3.Mp3AudioFile) [3]
- Raw ID3 data files (via eyed3.id3.TagFile) [3]

While the library's design is abstract to allow for potential future support of other formats [1][4], its current core functionality is focused on MP3s and ID3 tags. When using the library's loading functions (such as eyed3.load), it will return None if the file type or MIME type is not recognized [3]. Additionally, eyeD3 supports ID3 metadata versions 1.x (1.0/1.1), 2.2 (read-only), 2.3, and 2.4 [1][5].


eyed3 only supports MP3 files with ID3 tags; for non-MP3 formats, metadata extraction is silently skipped.

The eyed3 library is designed for MP3 files with ID3 tags, and eyed3.load() returns None for file types it does not recognize (for example FLAC, OGG Vorbis, WAV, M4A). No exception is raised in that case: the `audio_file is not None` check fails, the tag block is skipped, and the later defaults set the artist to "Unknown Artist" and the title to the file's base name. Users lose available metadata (artist, title, year, genre, cover art) for formats that support tags natively.

Consider adding explicit format validation before attempting ID3 extraction, or use a format-agnostic metadata library for broader audio format support.
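
A minimal sketch of what an explicit format check could look like. The helper name `looks_like_mp3` is hypothetical (it is not part of UltraSinger or eyed3) and the sketch uses only the standard library. It treats a file as MP3 when the extension is `.mp3` and the first bytes are either an ID3v2 header or an MPEG frame sync, which is a heuristic and not full validation.

```python
from pathlib import Path

# Assumption: only the ".mp3" extension is treated as an MP3 candidate.
MP3_EXTENSIONS = {".mp3"}


def looks_like_mp3(path: str) -> bool:
    """Heuristically decide whether `path` is worth passing to an ID3 reader."""
    file_path = Path(path)
    if file_path.suffix.lower() not in MP3_EXTENSIONS:
        return False
    try:
        with file_path.open("rb") as handle:
            header = handle.read(3)
    except OSError:
        # Missing or unreadable files are simply not MP3 candidates.
        return False
    # An ID3v2 tag starts with the ASCII bytes "ID3".
    if header.startswith(b"ID3"):
        return True
    # Otherwise expect an MPEG frame sync: the first 11 bits are all set.
    return len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0
```

A caller could run such a check before `eyed3.load()` and route other formats to a different tag reader, so that skipping a file is a deliberate decision and not the side effect of a `None` return value.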

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 674-674: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/UltraSinger.py` around lines 661 - 676, The current eyed3-based metadata
extraction (eyed3.load) only handles MP3/ID3 and silently fails for other
formats; update the logic in the metadata block (where eyed3.load is called and
file_artist, file_title, year, genres, cover_image_data are set) to first
validate the audio format (e.g., check file extension or MIME type) and, if not
MP3, use a format-agnostic metadata reader such as mutagen to extract
artist/title/year/genre/cover image, falling back to eyed3 only for MP3; keep
the same variables (file_artist, file_title, year, genres, cover_image_data) and
ensure exceptions still fall back safely to artist/title defaults.

Comment on lines +674 to +676

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace the blind except Exception: with specific exception types.

The broad exception handler can mask unexpected errors and make debugging difficult. eyed3.load() may raise OSError (IOError is an alias of OSError in Python 3) or eyed3-specific exceptions.

🛡️ Proposed fix to catch specific exceptions
-    except Exception:
+    except (OSError, IOError, AttributeError):
         file_artist = artist or "Unknown Artist"
         file_title = title
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    except Exception:
-        file_artist = artist or "Unknown Artist"
-        file_title = title
+    except (OSError, IOError, AttributeError):
+        file_artist = artist or "Unknown Artist"
+        file_title = title
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 674-674: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/UltraSinger.py` around lines 674 - 676, The broad "except Exception:"
around the eyed3.load() call should be replaced with explicit exception types
(e.g., IOError, OSError and the eyed3 library's exception class) and capture the
exception into a variable so you can log it before falling back to defaults;
update the except to something like "except (IOError, OSError, eyed3.Error) as
e:" (or the correct eyed3 exception class available in your environment), call
your logger with the exception (e.g., processLogger.error or logging.exception)
and then set file_artist = artist or "Unknown Artist" and file_title = title as
the fallback.
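
The narrowed handler described above could be sketched as follows. This is an illustration under assumptions: `read_artist_title` and its parameters are hypothetical names, the loader is injected so the sketch runs without eyed3 installed, and the tuple of expected exception types defaults to `OSError` only. With eyed3, one would pass `eyed3.load` as the loader and add the library's own exception class to `expected_errors` after checking which class the installed version provides.

```python
import logging
from typing import Any, Callable, Optional, Tuple, Type

logger = logging.getLogger(__name__)


def read_artist_title(
    path: str,
    loader: Callable[[str], Any],
    fallback_artist: Optional[str],
    fallback_title: str,
    expected_errors: Tuple[Type[BaseException], ...] = (OSError,),
) -> Tuple[str, str]:
    """Read artist/title via `loader`, logging only the expected failures."""
    try:
        audio_file = loader(path)
    except expected_errors as error:
        # Log the reason, then continue with the filename-derived values.
        logger.warning("Could not read tags from %s: %s", path, error)
        audio_file = None

    # The loader may return None or an object without a tag.
    tag = getattr(audio_file, "tag", None)
    artist = getattr(tag, "artist", None) or fallback_artist or "Unknown Artist"
    title = getattr(tag, "title", None) or fallback_title
    return artist, title
```

Because only the listed exception types are caught, a programming error such as a `ValueError` raised inside the loader still propagates instead of being hidden by the fallback.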


+    if not file_artist:
+        file_artist = "Unknown Artist"
+    if not file_title:
+        file_title = basename_without_ext
+    if not genres:
+        genres = None
+
+    basename_without_ext = f"{file_artist} - {file_title}"

     song_folder_output_path = os.path.join(settings.output_folder_path, basename_without_ext)
     song_folder_output_path = get_unused_song_output_dir(song_folder_output_path)
Comment on lines +685 to 688

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize tag-derived artist/title before using them in filesystem paths.

Line 685 uses raw ID3 values to build directory/file names. Tag values can contain path separators or invalid filename characters, causing path traversal-like folder creation or OS-specific failures.

🛠️ Proposed minimal change
+    invalid_chars = '\\/:*?"<>|'
+    safe_artist = "".join("_" if c in invalid_chars else c for c in file_artist).replace("..", "_").strip().strip(".")
+    safe_title = "".join("_" if c in invalid_chars else c for c in file_title).replace("..", "_").strip().strip(".")
-    basename_without_ext = f"{file_artist} - {file_title}"
+    basename_without_ext = f"{safe_artist} - {safe_title}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/UltraSinger.py` around lines 685 - 688, basename_without_ext is built
from raw tag values (file_artist, file_title) and used to create
song_folder_output_path; sanitize file_artist and file_title before composing
basename_without_ext to prevent path separators or invalid filename characters
from creating unexpected directories or OS errors. Update the code that sets
basename_without_ext (and any uses of file_artist/file_title) to run a sanitizer
function that strips or replaces path separators, control characters and
OS-invalid filename characters (e.g., replace "/" "\" ":" "*" "?" "<" ">" "|"
and nulls with underscores), trims whitespace, and optionally truncates to a
safe length, then use the sanitized values when calling
os.path.join(settings.output_folder_path, ...) and get_unused_song_output_dir.
Ensure the sanitizer is applied consistently wherever tag-derived names are used
for filesystem paths.
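
As a rough sketch of the sanitizer this prompt asks for: the function name `sanitize_filename_part`, the `"Unknown"` fallback, and the 100-character limit are all assumptions chosen for illustration, not values taken from UltraSinger.

```python
import re

# Path separators, characters that are invalid in Windows filenames,
# and ASCII control characters (including NUL).
_INVALID_FILENAME_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def sanitize_filename_part(value: str, fallback: str = "Unknown", max_length: int = 100) -> str:
    """Return `value` made safe for use as one component of a file or folder name."""
    cleaned = _INVALID_FILENAME_CHARS.sub("_", value)
    # Drop surrounding whitespace and dots so names like "..." or " . " cannot survive.
    cleaned = cleaned.strip().strip(".").strip()
    # Truncate, then re-trim in case the cut left a trailing space or dot.
    cleaned = cleaned[:max_length].rstrip(" .")
    return cleaned or fallback


safe_artist = sanitize_filename_part("AC/DC")  # -> "AC_DC"
safe_title = sanitize_filename_part("What?")   # -> "What_"
basename_without_ext = f"{safe_artist} - {safe_title}"
```

Applying the same function to both the artist and the title keeps the folder name and the derived file names consistent, and the fallback guards against tags that consist only of dots or whitespace.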

@@ -686,20 +712,20 @@ def infos_from_audio_video_input_file() -> tuple[str, str, str, MediaInfo]:
     )
     ultrastar_audio_input_path = os.path.join(song_folder_output_path, basename_with_ext)

-    # Todo: Read ID3 tags
-    if song_info.cover_image_data is not None:
-        save_image(song_info.cover_image_data, basename_without_ext, song_folder_output_path)
+    # Save cover image from ID3 tags
+    if cover_image_data is not None:
+        save_image(cover_image_data, basename_without_ext, song_folder_output_path)

     return (
         basename_without_ext,
         song_folder_output_path,
         ultrastar_audio_input_path,
         MediaInfo(
-            artist=song_info.artist,
-            title=song_info.title,
-            year=song_info.year,
-            genre=song_info.genres,
-            cover_url=song_info.cover_url,
+            artist=file_artist,
+            title=file_title,
+            year=year,
+            genre=genres,
+            cover_url=None,
             audio_extension=audio_ext,
             video_extension=video_ext
         ),