spark-reviews mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [spark] mundaym opened a new pull request #29762: [SPARK-32892][CORE][SQL] Fix hash functions on big-endian platforms.
Date Tue, 15 Sep 2020 13:56:52 GMT

mundaym opened a new pull request #29762:
URL: https://github.com/apache/spark/pull/29762


   MurmurHash3 and xxHash64 interpret sequences of bytes as integers
   encoded in little-endian byte order. On big-endian platforms, words
   loaded from memory therefore need a byte reversal before hashing.
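
To illustrate the byte reversal, here is a minimal sketch and not the code from this patch. The class `LittleEndianWords` and its methods are hypothetical names, and `ByteBuffer` with an explicit `ByteOrder` stands in for a native-order memory load so that both platform behaviours can be shown on one machine.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianWords {
    // Hypothetical helper: simulates a 4-byte load in the given platform byte order.
    static int nativeLoad(byte[] data, int offset, ByteOrder order) {
        return ByteBuffer.wrap(data, offset, 4).order(order).getInt();
    }

    // Returns the word as a little-endian hash expects it.
    // A big-endian load sees the bytes reversed, so it is swapped back.
    static int loadLittleEndian(byte[] data, int offset, ByteOrder order) {
        int word = nativeLoad(data, offset, order);
        return order == ByteOrder.BIG_ENDIAN ? Integer.reverseBytes(word) : word;
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, 0x04};
        int le = loadLittleEndian(data, 0, ByteOrder.LITTLE_ENDIAN);
        int be = loadLittleEndian(data, 0, ByteOrder.BIG_ENDIAN);
        // Both lines print 4030201: the same word on either platform.
        System.out.println(Integer.toHexString(le));
        System.out.println(Integer.toHexString(be));
    }
}
```

Without the swap, the simulated big-endian load would yield `0x01020304` for the same four bytes, and any hash mixing that word would diverge from the little-endian result.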
   
   I've left the hashInt and hashLong functions as-is for now. My
   interpretation of these functions is that they perform the hash on
   the integer value as if it were serialized in little-endian byte
   order. Therefore no byte reversal is necessary.
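
The reasoning above can be sketched as follows, assuming the interpretation given in this description is right. `ValueVersusBytes` and its methods are hypothetical names used only for illustration: an integer serialized in little-endian order and then decoded as a little-endian word gives back the original value, so a function that receives the value directly has nothing to reverse.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ValueVersusBytes {
    // Serializes an int in little-endian order, independent of the host platform.
    static byte[] toLittleEndianBytes(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    // Decodes four little-endian bytes into the word a byte-oriented hash would mix.
    static int fromLittleEndianBytes(byte[] bytes) {
        return (bytes[0] & 0xFF)
            | (bytes[1] & 0xFF) << 8
            | (bytes[2] & 0xFF) << 16
            | (bytes[3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D;
        int word = fromLittleEndianBytes(toLittleEndianBytes(value));
        // The decoded word equals the value itself, on any platform.
        System.out.println(word == value); // prints true
    }
}
```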
   
   ### What changes were proposed in this pull request?
   Modify hash functions to produce correct results on big-endian platforms.
   
   ### Why are the changes needed?
   Hash functions produce incorrect results on big-endian platforms which, amongst other potential issues, causes test failures.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing tests were run on the IBM Z (s390x) platform, which uses a big-endian byte order.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org




