hive-user mailing list archives

From Mayank Bansal
Subject Percentile calculation
Date Mon, 01 Oct 2012 08:50:31 GMT

I am trying to run the Hive UDF percentile on a column with roughly 116 million unique values.
The most memory I can give the reducer is 12 GB, and the job keeps failing with a Java heap
space error.
Is there a way to optimize this so that I don't hit the error?
Any other suggestions or solutions would also be welcome.
