hadoop-hdfs-user mailing list archives

From "Natarajan, Prabakaran 1. (NSN - IN/Bangalore)" <prabakaran.1.natara...@nsn.com>
Subject High performance Count Distinct - NO Error
Date Wed, 06 Aug 2014 08:52:37 GMT

I am looking for a high-performance count distinct solution for Hive queries.

Regular count distinct is very slow, but probabilistic count distinct has a higher error
percentage (when the number of records is small).

Is there any solution that gives an exact count distinct, with no error, while using little memory?

Thanks and Regards
