hive-user mailing list archives

From Abhishek Pratap Singh <manu.i...@gmail.com>
Subject Re: Percentage of rows in a Hive Table
Date Wed, 28 Mar 2012 18:39:35 GMT
I don't know how much this will help.

SELECT * FROM TABLE_DATA ORDER BY ROW_NAME DESC LIMIT COUNT. Here,
calculating COUNT as the top 5% of rows is a bit tricky. I don't think
this calculation can be done in a single query.
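
A rough two-step sketch of that idea, assuming the row count is fetched
first and the 5% figure is worked out outside Hive. The value 50 is a
hypothetical placeholder (5% of an assumed 1000 rows), not something taken
from the actual table, and as far as I know LIMIT needs a literal number
rather than a subquery:

```sql
-- Step 1: count the rows in the table.
SELECT COUNT(*) FROM TABLE_DATA;

-- Step 2: compute 5% of that count by hand or in a client script
-- (for example, 1000 rows -> 50), then substitute it as a literal.
-- The 50 below is a hypothetical placeholder, not a real result.
SELECT * FROM TABLE_DATA ORDER BY ROW_NAME DESC LIMIT 50;
```

This keeps the ordering from the query above and only replaces COUNT with
a number obtained from the first statement.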

Regards,
Abhishek

On Wed, Mar 28, 2012 at 9:09 AM, James Newhaven <james.newhaven@gmail.com> wrote:

> Thanks for the suggestion.
>
> I don't think sampling helps here, as I need to get the top 5% of rows
> ordered by a particular column (not a random sample).
>
>
>
> On Wed, Mar 28, 2012 at 5:03 PM, Gabi D <gabid33@gmail.com> wrote:
>
>> James,
>> See if sampling
>> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Sampling>
>> is what you need.
>>
>>
>>
>>
>> On Wed, Mar 28, 2012 at 5:53 PM, James Newhaven <james.newhaven@gmail.com> wrote:
>>
>>> I am trying to write a query that will return the first 5% of rows in a
>>> table.
>>>
>>> I've struggled with this for quite a while and can't figure out a
>>> command that works in Hive.
>>>
>>> Has anyone done this?
>>>
>>> Thanks,
>>> James
>>>
>>
>>
>
