hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joanne Chan <joannec...@gmail.com>
Subject Re: hive query
Date Fri, 12 Aug 2016 14:21:27 GMT
The query is assuming Keyword/Hour is unique which I am not sure if that's
an assumption per requirement.

If not, you'd probably want to group by those two columns.

select k.keyword
     , h.hour
     , sum(coalesce(t.totalcount,0))
from (select distinct keyword from t)
         as k
join (select h.pos as hour from (select posexplode(split(space(22),''))) as
h)       as h
left join t
  on t.keyword = k.keyword
 and t.hour    = h.hour
group by h.hour, k.keyword
order by h.hour, k.keyword
;

Nice trick on the `posexplode(split(space`

On Fri, Aug 12, 2016 at 9:28 AM, Markovitz, Dudu <dmarkovitz@paypal.com>
wrote:

> Hi Raj
>
>
>
> Here is the code.
>
>
>
> Dudu
>
>
>
>
>
> create table t
>
> (
>
>     Keyword         string
>
>    ,Hour            tinyint
>
>    ,TotalCount      bigint
>
> )
>
> ;
>
>  insert into table t values
>
>   ('Iphone' ,11 ,500)
>
> ,('Iphone' ,12,1000)
>
> ,('Samsung',11, 300)
>
> ,('Samsung',12, 600)
>
> ,('Nokia'  ,12, 200)
>
> ,('Iphone' ,16,1500)
>
> ;
>
>
>
> select          k.keyword
>
>                ,h.hour
>
>                ,t.totalcount
>
>
>
> from                            (select distinct keyword from
> t)                                                as k
>
>
>
>                  cross join     (select h.pos as hour from (select
> posexplode(split(space(22),''))) as h)       as h
>
>
>
>                  left join      t
>
>
>
>                  on             t.keyword       =
>
>                                 k.keyword
>
>
>
>                             and t.hour          =
>
>                                 h.hour
>
>
>
> order by        h.hour
>
>                ,k.keyword
>
> ;
>
>
>
>
>
> keyword
>
> hour
>
> totalcount
>
> Iphone
>
> 0
>
>
>
> Nokia
>
> 0
>
>
>
> Samsung
>
> 0
>
>
>
> Iphone
>
> 1
>
>
>
> Nokia
>
> 1
>
>
>
> Samsung
>
> 1
>
>
>
> Iphone
>
> 2
>
>
>
> Nokia
>
> 2
>
>
>
> Samsung
>
> 2
>
>
>
> Iphone
>
> 3
>
>
>
> Nokia
>
> 3
>
>
>
> Samsung
>
> 3
>
>
>
> Iphone
>
> 4
>
>
>
> Nokia
>
> 4
>
>
>
> Samsung
>
> 4
>
>
>
> Iphone
>
> 5
>
>
>
> Nokia
>
> 5
>
>
>
> Samsung
>
> 5
>
>
>
> Iphone
>
> 6
>
>
>
> Nokia
>
> 6
>
>
>
> Samsung
>
> 6
>
>
>
> Iphone
>
> 7
>
>
>
> Nokia
>
> 7
>
>
>
> Samsung
>
> 7
>
>
>
> Iphone
>
> 8
>
>
>
> Nokia
>
> 8
>
>
>
> Samsung
>
> 8
>
>
>
> Iphone
>
> 9
>
>
>
> Nokia
>
> 9
>
>
>
> Samsung
>
> 9
>
>
>
> Iphone
>
> 10
>
>
>
> Nokia
>
> 10
>
>
>
> Samsung
>
> 10
>
>
>
> Iphone
>
> 11
>
> 500
>
> Nokia
>
> 11
>
>
>
> Samsung
>
> 11
>
> 300
>
> Iphone
>
> 12
>
> 1000
>
> Nokia
>
> 12
>
> 200
>
> Samsung
>
> 12
>
> 600
>
> Iphone
>
> 13
>
>
>
> Nokia
>
> 13
>
>
>
> Samsung
>
> 13
>
>
>
> Iphone
>
> 14
>
>
>
> Nokia
>
> 14
>
>
>
> Samsung
>
> 14
>
>
>
> Iphone
>
> 15
>
>
>
> Nokia
>
> 15
>
>
>
> Samsung
>
> 15
>
>
>
> Iphone
>
> 16
>
> 1500
>
> Nokia
>
> 16
>
>
>
> Samsung
>
> 16
>
>
>
> Iphone
>
> 17
>
>
>
> Nokia
>
> 17
>
>
>
> Samsung
>
> 17
>
>
>
> Iphone
>
> 18
>
>
>
> Nokia
>
> 18
>
>
>
> Samsung
>
> 18
>
>
>
> Iphone
>
> 19
>
>
>
> Nokia
>
> 19
>
>
>
> Samsung
>
> 19
>
>
>
> Iphone
>
> 20
>
>
>
> Nokia
>
> 20
>
>
>
> Samsung
>
> 20
>
>
>
> Iphone
>
> 21
>
>
>
> Nokia
>
> 21
>
>
>
> Samsung
>
> 21
>
>
>
> Iphone
>
> 22
>
>
>
> Nokia
>
> 22
>
>
>
> Samsung
>
> 22
>
>
>
> Iphone
>
> 23
>
>
>
> Nokia
>
> 23
>
>
>
> Samsung
>
> 23
>
>
>
>
>
>
>
> *From:* raj hive [mailto:raj.hiveql@gmail.com]
> *Sent:* Friday, August 12, 2016 3:28 PM
> *To:* user@hive.apache.org
> *Subject:* hive query
>
>
>
> Dear Friends,
>
> I have a hive table with column name Keyword,Hour,TotalTweets.
>
> for example, I have the date for three keyword as below.
>
> *Keyword    Hour    TotalCount*
>
>  iphone       11           500
>
>  iphone       12           1000
>
>  Samsung   11           300
>  Samsung   12           600
>
>  Nokia         12           200
>
>  Iphone       16          1500
>
>  I want a query to get output for 24 hours like below. I need to show the
> zero count if i don't have the data. Can anyone help me the hive query.
>
>
>
> *Keyword*
>
> *hour*
>
> *TotalCount*
>
> iphone
>
> 0
>
> 0
>
> samsung
>
> 0
>
> 0
>
> nokia
>
> 0
>
> 0
>
> iphone
>
> 1
>
> 0
>
> samsung
>
> 1
>
> 0
>
> nokia
>
> 1
>
> 0
>
> iphone
>
> 2
>
> 0
>
> samsung
>
> 2
>
> 0
>
> nokia
>
> 2
>
> 0
>
> iphone
>
> 3
>
> 0
>
> samsung
>
> 3
>
> 0
>
> nokia
>
> 3
>
> 0
>
> iphone
>
> 4
>
> 0
>
> samsung
>
> 4
>
> 0
>
> nokia
>
> 4
>
> 0
>
> iphone
>
> 5
>
> 0
>
> samsung
>
> 5
>
> 0
>
> nokia
>
> 5
>
> 0
>
> iphone
>
> 6
>
> 0
>
> samsung
>
> 6
>
> 0
>
> nokia
>
> 6
>
> 0
>
> iphone
>
> 7
>
> 0
>
> samsung
>
> 7
>
> 0
>
> nokia
>
> 7
>
> 0
>
> iphone
>
> 8
>
> 0
>
> samsung
>
> 8
>
> 0
>
> nokia
>
> 8
>
> 0
>
> iphone
>
> 9
>
> 0
>
> samsung
>
> 9
>
> 0
>
> nokia
>
> 9
>
> 0
>
> iphone
>
> 10
>
> 0
>
> samsung
>
> 10
>
> 0
>
> nokia
>
> 10
>
> 0
>
> iphone
>
> 11
>
> 500
>
> samsung
>
> 11
>
> 300
>
> nokia
>
> 11
>
> 0
>
> iphone
>
> 12
>
> 1000
>
> samsung
>
> 12
>
> 600
>
> nokia
>
> 12
>
> 200
>
> iphone
>
> 13
>
> 0
>
> samsung
>
> 13
>
> 0
>
> nokia
>
> 13
>
> 0
>
> iphone
>
> 14
>
> 0
>
> samsung
>
> 14
>
> 0
>
> nokia
>
> 14
>
> 0
>
> iphone
>
> 15
>
> 0
>
> samsung
>
> 15
>
> 0
>
> nokia
>
> 15
>
> 0
>
> iphone
>
> 16
>
> 0
>
> samsung
>
> 16
>
> 0
>
> nokia
>
> 16
>
> 1500
>
> iphone
>
> 17
>
> 0
>
> samsung
>
> 17
>
> 0
>
> nokia
>
> 17
>
> 0
>
> iphone
>
> 18
>
> 0
>
> samsung
>
> 18
>
> 0
>
> nokia
>
> 18
>
> 0
>
> iphone
>
> 19
>
> 0
>
> samsung
>
> 19
>
> 0
>
> nokia
>
> 19
>
> 0
>
> iphone
>
> 20
>
> 0
>
> samsung
>
> 20
>
> 0
>
> nokia
>
> 20
>
> 0
>
> iphone
>
> 21
>
> 0
>
> samsung
>
> 21
>
> 0
>
> nokia
>
> 21
>
> 0
>
> iphone
>
> 22
>
> 0
>
> samsung
>
> 22
>
> 0
>
> nokia
>
> 22
>
> 0
>
> iphone
>
> 23
>
> 0
>
> samsung
>
> 23
>
> 0
>
> nokia
>
> 23
>
> 0
>
>
>
>
>

Mime
View raw message