hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Malak <michaelma...@yahoo.com>
Subject Re: how to combine some rows into 1 row in hive
Date Sun, 23 Jun 2013 21:40:50 GMT
It's really more of a HiveQL trick than a SQL trick.  In SQL, one would express it as a subquery,
which is more straightforward and readable, and has the additional advantage that you could
eliminate the aggregation and assume P and N are not true simultaneously, which would nicely
produce an error were that assumption ever violated.  HiveQL's lack of subqueries forces
this trick to use aggregation to simulate a merge.
 

________________________________
 From: Stephen Sprague <spragues@gmail.com>
To: user@hive.apache.org 
Sent: Sunday, June 23, 2013 1:20 PM
Subject: Re: how to combine some rows into 1 row in hive
  


so its not a Hive issue at all then instead its clever sql tricks.  good luck maintaining
that and hopefully you got some comments in your code for the next sap who comes along and
has the privilege of reading  it. :)   





On Sat, Jun 22, 2013 at 5:38 PM, zuohua zhang <zuohua@gmail.com> wrote:

Thanks Michael! That worked without modification!
>
>
>
>On Sat, Jun 22, 2013 at 5:05 PM, Michael Malak <michaelmalak@yahoo.com> wrote:
>
>Or, the single-language (HiveQL) alternative might be (i.e. I haven't tested it): 
>> 
>>select f1,
>>       f2, 
>>       if(max(if(f3='P',f4,null)) is null,0,max(if(f3='P',f4,null))) pf4,      
if(max(if(f3='P',f5,null)) is null,0,max(if(f3='P',f5,null))) pf5, 
>>       if(max(if(f3='N',f4,null)) is null,0,max(if(f3='N',f4,null))) nf4,
>>       if(max(if(f3='N',f5,null)) is null,0,max(if(f3='N',f5,null))) nf5 
>>from   mytable
>>group by f1, f2;
>> 
>>
>> 
>> From: Edward Capriolo <edlinuxguru@gmail.com>
>>To: "user@hive.apache.org" <user@hive.apache.org> 
>>Sent: Saturday, June 22, 2013 7:48 PM
>>Subject: Re: how to combine some rows into 1 row in hive
>> 
>>
>>
>>Using hives streaming feature is a nice option for this as it is a fairly natural
way to work with entire rows and return multiple columns
>>
>>select transform a,b,c,d  using /bin/pipeprogram as a, b ,c ,d
>>
>>
>>
>>You an also write a UDTF user defined Table function as well because this can return
more then one column.
>>
>>
>>
>>On Sat, Jun 22, 2013 at 6:36 PM, zuohua zhang <zuohua@gmail.com> wrote:
>>
>>I have the following table:
>>>f1 f2 f3 f4 f5
>>>a1 a2 P x1 x2 
>>>a1 a2 N x3 x4
>>>a1 a3 N x5 x6
>>>a4 a6 P x7 x8
>>>
>>>
>>>i want to convert to below:
>>>f1 f2 pf4 pf5 nf4 nf5
>>>a1 a2 x1 x2 x3 x4
>>>a1 a3 0 0 x5 x6
>>>a4 a6 x7 x8 0 0
>>>
>>>
>>>basically, when f3="P", I want f4 f5 to be moved to pf4 pf5 
>>>when f3="N", i want f4 f5 to be moved to nf4 nf5
>>>when there is no "P" or "N" rows for the record, I want to fill in zeros in the
corresponding fields.
>>>
>>>
>>>how to do it?
>>>
>>>
>>>Thanks,
>>>Zuohua
Mime
View raw message