hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "j.barrett Strausser" <j.barrett.straus...@gmail.com>
Subject Re: Semantics of Rank.
Date Thu, 25 Jul 2013 03:32:55 GMT
As an example : If I run my query above removing the arg the following is
thrown.

FAILED: SemanticException Failed to breakup Windowing invocations into
Groups. At least 1 group must only depend on input columns. Also check for
circular dependencies.
Underlying error: org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
One or more arguments are expected.


Similar issue and fix here:
http://www.marshut.com/rqvpz/use-rank-over-partition-function-in-hive-11.html

Even if it didn't require an arg it still doesn't explain my anomalous
output.



On Wed, Jul 24, 2013 at 11:28 PM, j.barrett Strausser <
j.barrett.strausser@gmail.com> wrote:

> That isn't true. If you try to run the above HIVE without an argument, it
> will throw an exception. I have seen other users replicate this problem as
> well.
>
> I can file a JIRA if someone can confirm that my query should work.
>
>
> On Wed, Jul 24, 2013 at 11:02 PM, manishbhoge@rocketmail.com <
> manishbhoge@rocketmail.com> wrote:
>
>> Analytical function doesn't expect any argument. Rank() itself enough to
>> sequence based on the window you have defined in partition by. So
>>
>> Rank() over (partition by cmscustid  order by orderdate)
>>
>> Should work as long as I have wrote right syntax for hive.
>>
>> Sent via Rocket from my HTC
>>
>> ----- Reply message -----
>> From: "j.barrett Strausser" <j.barrett.strausser@gmail.com>
>> To: <user@hive.apache.org>
>> Subject: Semantics of Rank.
>> Date: Thu, Jul 25, 2013 1:08 AM
>>
>>
>> Thanks for the reply. Perhaps my misunderstanding of the relation between
>> rank and the windowing function is wrong.
>>
>> What I want to achieve for the following is : For a given customer id,
>> sort his orders. I thought the below would work.
>>
>> SELECT eh.cmsorderid, eh.orderdate, RANK(orderdate) w FROM order_data eh
>> window w as (partition by cmscustid  order by orderdate);
>>
>> The rank function instead returns the rank of the order date over all all
>> order dates.
>>
>> Example snippet from above
>>
>> Actual :
>>
>> 6758783    27APR2012    94
>> 6758783    23JUN2012    95
>> 6758785    14DEC2012    96
>> 6758795    18DEC2011    97
>> 6758796    06MAY2012    98
>> 6758798    24MAR2013    99
>> 6758799    23NOV2012    100
>>
>>
>> Expected :
>>
>> 6758783    27APR2012    1
>> 6758783    23JUN2012    2
>> 6758785    14DEC2012    1
>> 6758795    18DEC2011    1
>> 6758796    06MAY2012    1
>> 6758798    24MAR2013    1
>> 6758799    23NOV2012    1
>>
>>
>> -b
>>
>>
>>
>>
>> On Wed, Jul 24, 2013 at 3:17 PM, Shahar Glixman <sglixman@outbrain.com
>> >wrote:
>>
>> > the argument to rank is simply some value, whereas the rank function
>> > compare this value
>> >  to the previous value received, if value is same, rank returns ++index,
>> > otherwise, rank return 1.
>> > pseudo code:
>> >
>> > class Rank {
>> > int index;
>> > Object previousValue = null;
>> > int evaluate(Object value) {
>> >   if (value == previousValue) {
>> >     return ++index;
>> >   }
>> >   previousValue = value;
>> >   index = 1;
>> >   return 1;
>> > }
>> >
>> >
>> > On Wed, Jul 24, 2013 at 9:59 PM, j.barrett Strausser <
>> > j.barrett.strausser@gmail.com> wrote:
>> >
>> >> It seems as though I am required to pass in an argument to RANK().
>> >>
>> >> What is the effect of passing this argument in ?
>> >>
>> >> In a RANK function the output for RANK should be the number of rows
>> >> preceding
>> >> a row in a given window. As  windows are specified by the partition and
>> >> order by fields I don't understand the effect of passing in an arg to
>> RANK
>> >>
>> >> Are the other non-aggregate function similar? Meaning : Cume_dist,
>> Ntile
>> >> --
>> >>
>> >>
>> >> https://github.com/bearrito
>> >> @deepbearrito
>> >>
>> >
>> >
>> > The above terms reflect a potential business arrangement, are provided
>> solely
>> > as a basis for further discussion, and are not intended to be and do not
>> > constitute a legally binding obligation. No legally binding obligations
>> will
>> > be created, implied, or inferred until an agreement in final form is
>> executed
>> > in writing by all parties involved.
>> >
>> > This email and any attachments hereto may be confidential or privileged.
>> >  If you received this communication by mistake, please don't forward it
>> > to anyone else, please erase all copies and attachments, and please let
>> > me know that it has gone to the wrong person. Thanks.
>> >
>>
>>
>>
>> --
>>
>>
>> https://github.com/bearrito
>> @deepbearrito
>>
>
>
>
> --
>
>
> https://github.com/bearrito
> @deepbearrito
>



-- 


https://github.com/bearrito
@deepbearrito

Mime
View raw message