Add unused assets checker script #180
| Diff line change |
|---|
| @@ -0,0 +1,160 @@ |
| import fs from "fs"; |
| import path from "path"; |
| import { globSync } from "glob"; |
| |
| /** |
| * UNUSED ASSETS CHECKER |
| * |
| * This is a VERY NAIVE test that identifies potentially unused assets in the build folder. |
| * |
| * How it works: |
| * 1. Scans all files in the build directory |
| * 2. Excludes files matching the EXCLUDE_PATTERNS (default: HTML files) |
| * 3. Creates a mapping of base filenames to their full paths |
| * 4. Searches through all build files for references to each base filename |
| * 5. Reports any files whose base names are never mentioned anywhere |
| * |
| * Limitations (under reporting): |
| * 1. If the basename of an asset is mentioned anywhere (even if not actually used), it will NOT be reported |
| * |
| * Limitations (over reporting): |
| * 1. If another website or external source references the asset, it will be reported as unused |
| * 2. If the asset is referenced in a way that does not include the base filename (e.g., dynamically constructed paths), it will be reported as unused |
| * 3. If the asset is referenced with URL encoding or special characters, it may be reported as unused |
| * |
| * This is useful for finding obviously unused files, but manual review is recommended |
| * before deleting anything identified by this test. |
| */ |
| |
| const BUILD_DIR = path.join(process.cwd(), "build"); |
| |
| // Configurable: Regular expressions to exclude files from being checked as potential unused assets |
| // By default, we exclude HTML files since they are the primary content, not assets |
| // If there are assets in this site which are not referenced by filename |
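The five steps listed in the header comment can be sketched roughly as follows. This is a minimal in-memory illustration under stated assumptions, not the PR's actual script: `findUnreferencedAssets`, the `files` map, and the hard-coded HTML exclusion are hypothetical, and reading files from the build directory is deliberately left out.

```javascript
// Hypothetical sketch of the steps in the header comment; not the PR's code.
// `files` maps a build-relative path to that file's text content.
const EXCLUDE_PATTERNS = [/\.html$/i]; // assumed default: HTML is content, not an asset

function findUnreferencedAssets(files) {
  // Steps 1-3: map each candidate asset's base filename to its full paths.
  const unreferenced = new Map();
  for (const filePath of files.keys()) {
    if (EXCLUDE_PATTERNS.some((pattern) => pattern.test(filePath))) continue;
    const baseName = filePath.split("/").pop(); // simplification of path.basename
    const paths = unreferenced.get(baseName) ?? [];
    paths.push(filePath);
    unreferenced.set(baseName, paths);
  }

  // Step 4: search every file for each base name that is still unaccounted for.
  for (const content of files.values()) {
    for (const baseName of unreferenced.keys()) {
      if (content.includes(baseName)) unreferenced.delete(baseName);
    }
    if (unreferenced.size === 0) break; // nothing left to look for
  }

  // Step 5: whatever remains was never mentioned anywhere.
  return [...unreferenced.values()].flat().sort();
}

const report = findUnreferencedAssets(new Map([
  ["index.html", '<link rel="stylesheet" href="/css/site.css"><img src="/img/logo.svg">'],
  ["css/site.css", "body { margin: 0; }"],
  ["img/logo.svg", "<svg></svg>"],
  ["img/old-banner.svg", "<svg></svg>"],
]));
console.log(report); // [ 'img/old-banner.svg' ]
```

Because the search is a plain substring test on the base name, the sketch shares the under-reporting and over-reporting limitations described in the header comment.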
Copilot AI, Dec 4, 2025
The binary file exclusion pattern is missing several common binary file types, which could cause the script to fail or behave unexpectedly when it tries to read them as UTF-8. Consider adding extensions such as .webp, .pdf, .zip, .mp4, .webm, .mp3, and .wav.
Note that .svg files are text-based XML and can be searched, so they should stay out of the binary list while the other formats are added.
Suggested change:

    - if (/\.(png|jpg|jpeg|gif|ico|woff|woff2|ttf|eot|otf)$/i.test(file)) {
    + if (/\.(png|jpg|jpeg|gif|ico|webp|pdf|zip|mp4|webm|mp3|wav|woff|woff2|ttf|eot|otf)$/i.test(file)) {
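As the extension list grows, one possible alternative to a long regex is a set lookup. The sketch below is only an illustration of that idea: `BINARY_EXTENSIONS` and `isLikelyBinary` are hypothetical names that do not appear in the PR, and the list simply mirrors the suggested pattern, with .svg left out because it is text.

```javascript
// Hypothetical alternative to the regex in the suggestion; not code from the PR.
// Extensions whose contents should not be read and searched as UTF-8 text.
const BINARY_EXTENSIONS = new Set([
  "png", "jpg", "jpeg", "gif", "ico", "webp",
  "pdf", "zip", "mp4", "webm", "mp3", "wav",
  "woff", "woff2", "ttf", "eot", "otf",
]);

function isLikelyBinary(fileName) {
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return false; // no extension: treat as text
  return BINARY_EXTENSIONS.has(fileName.slice(dot + 1).toLowerCase());
}

console.log(isLikelyBinary("img/hero.WEBP")); // true
console.log(isLikelyBinary("img/logo.svg"));  // false
```

The lookup is case-insensitive, matching the `i` flag on the original pattern.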
Copilot AI, Dec 4, 2025
Searching with content.includes(baseName) can produce false matches when the base name is a substring of other content (e.g., searching for "app.js" would also match "webapp.js" or "my-app.js-backup"), so an unused asset would be treated as referenced. This is acceptable for the stated "naive" approach, and the script already documents under-reporting as a limitation, but consider documenting this specific case in the limitations section at the top of the file.
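The substring case can be reproduced in a few lines. The stricter check shown alongside it is only one possible heuristic, not something the PR or the review proposes: `mentionsExactName` is a hypothetical helper, and its choice of which characters count as part of a filename is an assumption.

```javascript
// The only reference in this content is to webapp.js.
const content = '<script src="/js/webapp.js"></script>';

// Naive check: "app.js" is a substring of "webapp.js", so it counts as mentioned.
const naiveMatch = content.includes("app.js");

// Hypothetical stricter check: the name must not be preceded or followed by a
// character that could plausibly belong to a longer filename.
function mentionsExactName(text, baseName) {
  const escaped = baseName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`(^|[^A-Za-z0-9_.-])${escaped}(?![A-Za-z0-9_-])`);
  return pattern.test(text);
}

console.log(naiveMatch);                                             // true
console.log(mentionsExactName(content, "app.js"));                   // false
console.log(mentionsExactName('<script src="/js/app.js">', "app.js")); // true
```

A stricter match trades under-reporting for over-reporting: anything that refers to the file in an unusual form would now be missed, so the result still needs manual review.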
Copilot AI, Dec 4, 2025
Iterating over all remaining base names for every file could be inefficient when there are many assets and files. The current implementation has O(n*m) complexity, where n is the number of files and m is the number of unreferenced assets. Consider optimizing by:
- Building a single search pattern with all base names (e.g., using a regex with alternation)
- Or, at minimum, converting unreferenced.keys() to an array once before the outer loop to avoid repeated iterator creation
Suggested change:

    - // Check each remaining base name to see if it appears in this file
    - for (const baseName of unreferenced.keys()) {
    -   if (content.includes(baseName)) {
    -     unreferenced.delete(baseName);
    -     // Early exit if we've found all files
    -     if (unreferenced.size === 0) {
    -       return unreferenced;
    -     }
    -   }
    - }
    + // Build a regex that matches any of the remaining base names
    + const baseNames = Array.from(unreferenced.keys());
    + if (baseNames.length === 0) {
    +   return unreferenced;
    + }
    + // Escape regex special characters in base names
    + const escapedBaseNames = baseNames.map(name => name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
    + const pattern = new RegExp(escapedBaseNames.join('|'), 'g');
    + const matches = new Set();
    + let match;
    + while ((match = pattern.exec(content)) !== null) {
    +   matches.add(match[0]);
    + }
    + for (const found of matches) {
    +   unreferenced.delete(found);
    + }
    + // Early exit if we've found all files
    + if (unreferenced.size === 0) {
    +   return unreferenced;
    + }
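One thing worth checking before adopting the alternation approach is whether it finds the same names as the per-name includes() test. The sketch below suggests it may not when one base name is a prefix of another, because a global regex reports non-overlapping matches and takes the first alternative that succeeds. The helper names and the sample content are hypothetical and exist only for this comparison.

```javascript
// Hypothetical helpers for comparing the two search strategies; not code from the PR.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Original behaviour: one substring test per base name.
function referencedByIncludes(content, baseNames) {
  return baseNames.filter((name) => content.includes(name));
}

// Suggested behaviour: a single alternation regex over all base names.
function referencedByAlternation(content, baseNames, { longestFirst = false } = {}) {
  if (baseNames.length === 0) return [];
  const ordered = longestFirst
    ? [...baseNames].sort((a, b) => b.length - a.length)
    : baseNames;
  const pattern = new RegExp(ordered.map(escapeRegExp).join("|"), "g");
  const found = new Set();
  for (const match of content.matchAll(pattern)) found.add(match[0]);
  return baseNames.filter((name) => found.has(name));
}

const content = "//# sourceMappingURL=app.js.map";
const baseNames = ["app.js", "app.js.map"];

console.log(referencedByIncludes(content, baseNames));
// [ 'app.js', 'app.js.map' ]
console.log(referencedByAlternation(content, baseNames));
// [ 'app.js' ]
console.log(referencedByAlternation(content, baseNames, { longestFirst: true }));
// [ 'app.js.map' ]
```

In this example neither ordering reproduces the includes() result, so switching strategies would change which assets are reported, not just how fast the scan runs.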
Copilot AI, Dec 4, 2025
The main() function doesn't check whether BUILD_DIR exists before scanning. If the build directory doesn't exist (e.g., before running yarn build), the script will fail ungracefully. Consider adding a check and providing a helpful error message:

    if (!fs.existsSync(BUILD_DIR)) {
      console.error(`❌ Error: Build directory not found at ${BUILD_DIR}`);
      console.error('ℹ️ Please run "yarn build" first.');
      process.exit(1);
    }
[nitpick] The test:unused-assets script does not follow the same pattern as the other test scripts. While test:html-validate and test:dirty-file-paths-checker accept command-line arguments through "$@", this script doesn't pass arguments. For consistency, consider forwarding arguments here as well; this would keep the scripts uniform even if the unused-assets script doesn't currently use the arguments.
That is intentional, because you must always consider the whole site to figure out which assets are not being used.