mlxml: preserve percent-encoding in URI paths#21
ssahani wants to merge 1 commit into libguestfs:master
| @@ -0,0 +1,124 @@ | |||
| From ea4c85204a87e0156f1d22d845844b4a4ad747cc Mon Sep 17 00:00:00 2001 | |||
|
I don't understand why this change is needed at all. Can you explain what problem you're trying to solve, ideally with examples? In general, trying to hand-parse URIs is a big code smell. Libraries (like libxml2) exist to take care of the many details, standards and complications around URIs. Therefore the barrier to entry for this code is much higher than normal, since it is doing something that is probably inadvisable. And since I don't understand the purpose of this code, I cannot right now think about accepting it. |
f28d178 to ae524c7
|
This file name is getting changed into This path does not exist . xmlParseURI automatically decodes percent-encoded characters (e.g. %2f becomes /), which corrupts paths where %2f is a literal part of the filename on VMFS/ESXi filesystems. xmlParseURIRaw with raw=1 preserves the percent-encoding so the path is passed to SFTP exactly as the user specified it. |
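
To make the failure mode concrete, here is a minimal sketch of what percent-decoding does to such a path. It is a toy decoder written for illustration, not libxml2's implementation, and the path used in the usage note is hypothetical (the real file name is not shown in this thread).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Toy percent-decoder (NOT libxml2's code): copies src to dst and turns
 * every valid %XX triplet into the byte it encodes.  The decoded string
 * is never longer than the input, so dst_size > strlen(src) suffices. */
static void percent_decode(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    assert(dst_size > strlen(src));

    while (*src) {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            *dst++ = (char)(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Decoding the hypothetical path `/vmfs/volumes/ds1/disk%2fone.vmx` this way yields `/vmfs/volumes/ds1/disk/one.vmx`: the literal `%2f` in the file name has become a directory separator, so the result names a different (and probably nonexistent) file.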
|
This changes the semantics of this API. You'd need to audit every user that's calling parse_uri throughout libguestfs code to see if they are passing in escaped URIs or not. If they are expecting the URI to be unescaped, this patch breaks those callers. If the path is literally As other patches have addressed, VMware has obvious bugs here. So IMO it's probably better to find wherever this path is passed to parse_uri in our VMware-specific code, and URI-escape it first. |
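
A rough sketch of the "escape it first" idea, assuming the only character that needs protecting is `%` itself. The helper name `escape_percent` is hypothetical and is not part of libguestfs or libxml2; a real URI escaper would also have to handle other reserved characters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of path in which
 * every '%' is replaced by "%25", so that a later percent-decoding step
 * gives back the original bytes.  The caller frees the result.
 * Returns NULL if allocation fails. */
static char *escape_percent(const char *path)
{
    assert(path != NULL);

    size_t len = strlen(path);
    char *out = malloc(len * 3 + 1);   /* worst case: every byte is '%' */
    if (out == NULL)
        return NULL;

    char *p = out;
    for (; *path; path++) {
        if (*path == '%') {
            memcpy(p, "%25", 3);
            p += 3;
        } else {
            *p++ = *path;
        }
    }
    *p = '\0';
    return out;
}
```

With the hypothetical path `/vmfs/volumes/ds1/disk%2fone.vmx` this produces `/vmfs/volumes/ds1/disk%252fone.vmx`. A decoding parser turns `%25` back into `%`, so the literal `%2f` survives without changing what parse_uri does for its other callers.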
|
The proposed commit now changes mlxml is essentially a mini-binding around the xml* functions from libxml2, as it says in the documentation. If a binding around |
24ce106 to b9f7d81
|
Thanks for the reviews. Updated. |
|
It's OK, but the whole of the code building the tuple is now duplicated. It'd be nice not to have that. It's hard to understand from the libxml2 docs what The libxml2 function isn't used anywhere else as far as I can see. But glib's |
Add a new parse_uri_raw binding that wraps xmlParseURIRaw(str, raw). When raw is true, percent-encoding in URI paths is preserved. On VMFS/ESXi, %2f is literal in filenames and must not be decoded to '/'. The existing parse_uri binding remains unchanged.

Fixes: https://issues.redhat.com/browse/RHEL-136481
Signed-off-by: Susant Sahani <ssahani@redhat.com>
b9f7d81 to 63bcae2
|
Indeed. Updated. Thanks for the review. |
xmlParseURI automatically decodes percent-encoded characters in the path component, which causes issues when paths contain %2f (encoded forward slash). For example, a VMX file path containing '%2f' would be decoded to '/', creating an invalid double-slash in the path.
Extract the raw path from the original URI string before xmlParseURI processes it, preserving percent-encoding like %2f. This allows proper handling of paths with special characters that need to remain encoded.
Fixes: https://issues.redhat.com/browse/RHEL-136481
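
For readers wondering what "extract the raw path from the original URI string" amounts to, the following is a simplified sketch, not the code from this commit. The function name `raw_path` and the example URI are hypothetical, and it only handles the `scheme://authority/path` shape, which is exactly the kind of shortcut the review above warns about when hand-parsing URIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: locate the still-encoded path inside uri.
 * On success, stores the path length in *len and returns a pointer
 * into uri (no copy is made).  Returns NULL if uri does not look like
 * "scheme://authority/path".  Edge cases such as a '?' inside the
 * authority part are deliberately ignored here. */
static const char *raw_path(const char *uri, size_t *len)
{
    assert(uri != NULL && len != NULL);

    const char *p = strstr(uri, "://");
    if (p == NULL)
        return NULL;

    p = strchr(p + 3, '/');      /* first '/' after the authority */
    if (p == NULL)
        return NULL;

    *len = strcspn(p, "?#");     /* the path ends at a query or fragment */
    return p;
}
```

For the hypothetical input `ssh://root@esxi.example.com/vmfs/volumes/ds1/disk%2fone.vmx?no_verify=1` this returns a pointer to `/vmfs/volumes/ds1/disk%2fone.vmx` with `%2f` intact.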