[library] Refine oversized Excel export handling #67
💡 Codex Review
Here are some automated review suggestions for this pull request.
    for df, sheet in zip(dataframes, sheets):
        sanitized_sheet: str = self._sanitize_name(sheet)
        base_filename: str = f"{datetime_string}_{file_name}_{sanitized_sheet}"

        parquet_path: Path = self._build_output_path(
Prevent overwriting when sanitized sheet names collide
Output filenames are derived from self._sanitize_name(sheet) and reused for both the Parquet and CSV exports, but nothing guarantees that the sanitized names are unique. If two sheets differ only in characters that _sanitize_name strips (e.g. "Sales#1" vs "Sales@1"), both exports resolve to the same base_filename, so the later write silently replaces the earlier files and that sheet's data is lost. Consider de-duplicating the sanitized names, or appending a counter when a collision occurs, before writing.
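
One possible shape for the de-duplication step is sketched below. It is not taken from this PR: `sanitize_name` is a hypothetical stand-in for `_sanitize_name` (whose real rules are not shown here), and `unique_sanitized_names` is an illustrative helper name. The idea is to resolve all sheet names up front and append a counter whenever a sanitized name has already been used.

```python
import re
from typing import List, Set


def sanitize_name(name: str) -> str:
    """Hypothetical stand-in for _sanitize_name: keep letters, digits, '_' and '-'."""
    return re.sub(r"[^A-Za-z0-9_-]", "", name)


def unique_sanitized_names(sheets: List[str]) -> List[str]:
    """Return one sanitized name per sheet, adding a numeric suffix on collisions."""
    used: Set[str] = set()
    result: List[str] = []
    for sheet in sheets:
        # Fall back to a placeholder if sanitizing strips every character.
        base = sanitize_name(sheet) or "sheet"
        candidate = base
        counter = 2
        # Compare case-insensitively, since some filesystems ignore case.
        while candidate.lower() in used:
            candidate = f"{base}_{counter}"
            counter += 1
        used.add(candidate.lower())
        result.append(candidate)
    return result


if __name__ == "__main__":
    print(unique_sanitized_names(["Sales#1", "Sales@1", "Summary"]))
```

Under these assumptions, the export loop could iterate over `zip(dataframes, unique_sanitized_names(sheets))` instead of sanitizing each sheet inside the loop, so every `base_filename` is distinct before any file is written. The loop keeps probing until it finds a free name, which also covers the case where a suffixed name such as `Sales1_2` coincides with a real sheet.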