hive-user mailing list archives

From Gopal Vijayaraghavan <go...@hortonworks.com>
Subject Re: Hi, Hive People urgent question about [Distribute By] function
Date Thu, 22 Oct 2015 16:25:41 GMT

> When applying [Distribute By] on Hive to the framework, the function
> should be partitionByHash on Flink. This is to spread out all the rows
> distributed by a hash key from Object Class in Java.

Hive does not use the Java Object hashCode - the identityHashCode is
inconsistent, so Object.hashCode() cannot be used for this.

ObjectInspectorUtils::hashCode() is the hash code used by DISTRIBUTE BY in
Hive (SORT BY uses a random number generator).
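
To illustrate the difference, here is a minimal sketch in plain Java. It is
not Hive's code: the class DistributeByHashSketch and the helpers valueHash
and reducerFor are hypothetical names made up for this example, and
Arrays.hashCode only stands in for a value-based hash such as the one
ObjectInspectorUtils computes. The point is that a hash built from the field
values sends equal rows to the same reducer, while the identity hash depends
on the object instance.

```java
import java.util.Arrays;

public class DistributeByHashSketch {
    // Hypothetical helper: hash a row from its field values,
    // not from the identity of the array object holding them.
    public static int valueHash(Object[] row) {
        return Arrays.hashCode(row);
    }

    // Hypothetical helper: map a hash to a reducer index in [0, numReducers).
    // Masking the sign bit keeps the result non-negative.
    public static int reducerFor(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        Object[] a = {"alice", 42};
        Object[] b = {"alice", 42}; // same values, different object

        // Value-based hash: both rows land on the same reducer.
        System.out.println(reducerFor(valueHash(a), 4) == reducerFor(valueHash(b), 4));

        // Identity hash: tied to the instance, so it usually differs
        // between a and b and between JVM runs.
        System.out.println(System.identityHashCode(a) + " vs " + System.identityHashCode(b));
    }
}
```

The first line printed is true on every run; the two identity hashes on the
second line are not stable, which is why they are no use as a distribution
key.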

Cheers,
Gopal

