Continue Add scan file body endpoint #83
Open
TrevisGordan wants to merge 7 commits into element-hq:main
@reivilibre Please check out the changes and share your feedback.
@TrevisGordan This branch has conflicts :)
Continues #79 by @Half-Shot, which implements #78 — the ability to scan a file before uploading it to Matrix.
PR #79 was closed as Half-Shot shifted away from the content scanner (comment). This PR picks up the work, addresses the review feedback from @reivilibre, and restructures the implementation.
What changed from #79
Instead of adding the new `scan_file` functionality on top of the existing code, I refactored the scanner so that all scan paths follow the same pipeline.
Key change: no more streaming to disk for multipart uploads. In #79, `write_multipart_to_disk` streamed the upload to disk, but since ~99% of files will be encrypted, the content would immediately be read back into memory for decryption and then written to disk again, which entirely negated the benefit of streaming. The multipart body is now read into memory once in the servlet layer and passed as raw bytes to the scanner.
Scanner changes
- `scan_file_on_disk` renamed to `scan_content`: now accepts raw bytes instead of a file path.
- `_do_scan`: extracted shared scan logic (mimetype check, run scan, cleanup) that was previously duplicated between `_scan_media` and `scan_file_on_disk`.
- `write_multipart_to_disk`: no longer needed.
- `aiofiles` in `_write_file_to_disk`.
Bug fix from #79
`write_multipart_to_disk` was called from the servlet with a `BodyPartReader`, but `scan_file_on_disk` received the resulting file path as its `file_path` parameter and then passed it to `_decrypt_file`, which expected bytes, not a path. This is fixed by the restructuring above.
Documentation
Improved `docs/api.md` for the `scan_file` endpoint: `curl` and Python usage examples.
Suggestions for follow-up or now
- Rename the `scan_file` endpoint to `scan_upload` or `scan_content` to avoid confusion with the internal `scan_file` / `scan_content` scanner methods.
- `scan_media`: the media path is used as the filename, but the file is deleted after scanning anyway.
- `file` parameter naming: confusing, but inherited from the Matrix client-server spec `EncryptedFile` structure.
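For reviewers who want a picture of the unified pipeline described under "What changed from #79" and "Scanner changes", here is a rough, self-contained sketch. It is not the code in this PR. Only the names `scan_content` and `_do_scan` come from the description above. `SketchScanner`, `ScanRejected`, `sniff_mimetype`, `fake_scan` and `xor_decrypt` are hypothetical stand-ins, the XOR "decryption" is a toy rather than the real `EncryptedFile` handling, and the magic-byte table replaces whatever mimetype detection the scanner actually uses.

```python
import asyncio
import os
import tempfile
from typing import Callable, Optional

# Toy magic-byte table standing in for a real mimetype detector (assumption).
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
}


def sniff_mimetype(content: bytes) -> str:
    """Guess a mimetype from the leading bytes of the content."""
    for signature, mimetype in _SIGNATURES.items():
        if content.startswith(signature):
            return mimetype
    return "application/octet-stream"


class ScanRejected(Exception):
    """Hypothetical error raised when the mimetype is not allowed."""


class SketchScanner:
    def __init__(
        self,
        allowed_mimetypes: set,
        run_scan: Callable[[str], bool],
        decrypt: Optional[Callable[[bytes, dict], bytes]] = None,
    ) -> None:
        self._allowed = set(allowed_mimetypes)
        self._run_scan = run_scan  # takes a file path, returns True if clean
        self._decrypt = decrypt    # takes (bytes, metadata), returns bytes

    async def scan_content(self, body: bytes, metadata: Optional[dict] = None) -> bool:
        """Entry point for uploads: raw bytes in, scan verdict out."""
        if metadata is not None:
            if self._decrypt is None:
                raise ValueError("encrypted content but no decrypt function")
            # Decrypt in memory; the bytes never touch disk before this point.
            body = self._decrypt(body, metadata)
        return await self._do_scan(body)

    async def _do_scan(self, content: bytes) -> bool:
        """Shared steps: mimetype check, run the scan, clean up."""
        mimetype = sniff_mimetype(content)
        if mimetype not in self._allowed:
            raise ScanRejected(f"mimetype {mimetype} is not allowed")
        fd, path = tempfile.mkstemp(prefix="scan-")
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(content)
            # Run the blocking scan callable in a worker thread.
            return await asyncio.to_thread(self._run_scan, path)
        finally:
            # The temporary file is removed whether or not the scan succeeded.
            os.remove(path)


def fake_scan(path: str) -> bool:
    """Stand-in scan: content is 'clean' unless it contains b'EICAR'."""
    with open(path, "rb") as handle:
        return b"EICAR" not in handle.read()


def xor_decrypt(data: bytes, metadata: dict) -> bytes:
    """Toy single-byte XOR, only to exercise the decrypt step."""
    return bytes(b ^ metadata["key"] for b in data)


scanner = SketchScanner({"image/png"}, fake_scan, xor_decrypt)
png = b"\x89PNG\r\n\x1a\n" + b"pixels"
print(asyncio.run(scanner.scan_content(png)))
```

The point of the sketch is the type boundary: `scan_content` only ever sees `bytes`, so the path-versus-bytes mix-up described under "Bug fix from #79" cannot occur in this shape, and the single write to disk happens after any decryption.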