Fix handling of unsigned arithmetic.#2
Conversation
|
I haven't had time to add a test case myself, unfortunately, that specifically highlights this problem. Should be straightforward to write, though (see the note above regarding trailing 0xff bytes, for example). That said: I'd suggest taking a look at this. The existing implementation works for simple test cases, but will often compute incorrect results on real data. (It may even fail to prevent hash flooding, but I haven't convinced myself of that.) |
|
I found the same problem. Note: the length can not be negative, so the change "((long) data.length) & 0xff" is not needed. It is unfortunate the that official SipHash test vectors doesn't include randomly generated data... |
|
By the way, you may also want to fix "b[0] = (byte) (l & 0xff);" and so on (for those cases, the "& 0xff" is not needed). |
... eliminate noise from the PR diff.
|
Good point. Reverted the length change and some whitespace noise. Leaving the b[0] code alone for this PR. |
| @@ -27,19 +27,19 @@ private static long lastBlock(byte[] data, int iter) { | |||
|
|
|||
There was a problem hiding this comment.
There is one more change for this PR:
diff --git a/src/main/java/com/github/emboss/siphash/SipHash.java b/src/main/java/com/github/emboss/siphash/SipHash.java
index 9f31287..00c4640 100644
--- a/src/main/java/com/github/emboss/siphash/SipHash.java
+++ b/src/main/java/com/github/emboss/siphash/SipHash.java
@@ -21,7 +21,7 @@ public class SipHash {
}
private static long lastBlock(byte[] data, int iter) {
- long last = ((long) data.length) << 56;
+ long last = (((long) data.length) & 0xff) << 56;
int off = iter * 8;
switch (data.length % 8) {Credit: Bryan Silverthorn, emboss/siphash-java#2
The existing implementation unfortunately falls prey to Java signed byte arithmetic in a couple of places. The hash of almost any odd-length buffer involving trailing 0xff bytes, for example, will not match the SipHash reference C implementation. This problem also makes it easy to generate collisions.
This pull request applies the typical Java unsigned-math workaround where necessary. The output now matches the reference implementation in my testing.
Apologies for the couple of accidental whitespace changes.