Speed up JavaNIODiskDriver.mapPath()#26
Open
buttercookie42 wants to merge 4 commits intoFileSysOrg:masterfrom
Open
Speed up JavaNIODiskDriver.mapPath()#26buttercookie42 wants to merge 4 commits intoFileSysOrg:masterfrom
buttercookie42 wants to merge 4 commits intoFileSysOrg:masterfrom
Conversation
Opening a file from Windows Explorer results in multiple (possibly five or six) FIND_FIRST2 requests for the selected file. Each FIND_FIRST2 request results in a call to DiskInterface.mapPath(). In a large directory with thousands of files, each mapPath() has a considerable overhead (hundreds of ms), mainly caused by the calls to lastDir.list(), which have to list the whole directory before being able to proceed. For the directories at least that overhead is only incurred when actually needing to explicitly conduct a case-insensitive search, but for the file name currently lastDir.list() is always called unconditionally, which incurs the aforementioned large overhead. To speed up mapPath, I propose doing two things: - Switch the file names to same logic as the directory names, i.e. first we simply check whether the file exists, and only if that fails do we do the potentially more expensive iteration through the whole directory to check for a case-insensitive match. - Seeing how this is the Java*NIO*DiskDriver anyway, rewrite the directory search to actually use the new NIO APIs.
No need to expensively query the file system for a file we know won't exist.
…sensitive If the underlying disk device is known to be case-insensitive, a nonexistent file really is nonexistent, so we can skip trying to traverse the whole directory in search of a possibly matching file with a different case.
Author
|
Another performance optimisation that can be made is when the underlying file system is known to be case-insensitive, such as when running on a Windows host, or Android for the public-facing parts of the internal storage (as well as removable SD cards usually). In that case, a non-existing file really is non-existing and part of checking for a similarly-named file with a difference in casing can be skipped. At least when running on an Android host, for directories with a few thousand files this makes a noticeable difference in performance when creating respectively deleting files, both of which cases try to map non-existing file names either before file creation respectively after file deletion. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
While trying to speed up file system accesses in directories containing a four-digit amount of files, I've noticed that (at least with SMB1) Windows Explorer seems to issue up to 5 or 6 FIND_FIRST2 requests for launching a file, each of which needs to map the path.
In large directories, this seems to incur a noticeable overhead because in the current implementation,
lastDir.list()is unconditionally always called for mapping the filename portion of the path (for the directory portion of the path at least this only happens when we need to fall back to an explicitly case-insensitive search). (I guess SMB2 traffic isn't much affected by this, because there Windows seems to repeatedly open and close the file instead, and theopenFilemethod has a shortcut to not map the path ifFiles.exists()returns true.)To fix this, I propose that
a) mapping the file name should take the "if the file exists, simply declare success" approach, too (a number of callers of
mapPathalready do that as well, too) and only fall back to manually traversing the whole directory for the fallback case-insensitive search, andb) to rewrite the case-insensitive search to a NIO API-based approach, so we don't always have to preemptively traverse the whole directory, but instead only as far as until we (hopefully) find a match.
With the above fixed, I've found another follow-up issue: A wildcard search issued for listing a directory's whole contents always ends up in
mapPathwith the*as the "filename". So even with the above fixes in place, we always end up futilely searching the whole directory for the non-existant"*"file. Adding a shortcut for that case speeds up directory listings a bit, too.With all these changes in place (including the search optimisation from #25), a jfileserver implementation running on my phone is now twice as fast at recursively listing my complete SD card compared to natively running Samba via Termux, even with jfileserver only using SMB1 and Samba using SMB2.