hive-user mailing list archives

From Lefty Leverenz <leftylever...@gmail.com>
Subject Re: complex datatypes filling
Date Fri, 17 Jan 2014 10:06:12 GMT
Here's the wikidoc for transform: Transform/Map-Reduce Syntax
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Transform>.
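
A minimal sketch of the syntax (script and column names here are made
up for illustration):

    ADD FILE my_script.py;

    SELECT TRANSFORM (col1, col2)
      USING 'python my_script.py'
      AS (out1, out2)
    FROM some_table;

The script reads tab-separated input rows on stdin and writes
tab-separated output rows to stdout, so it can do arbitrary per-row
processing.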

-- Lefty


On Thu, Jan 16, 2014 at 10:44 PM, Bogala, Chandra Reddy <
Chandra.Bogala@gs.com> wrote:

> Thanks for the quick reply. I will take a look at stream jobs and the
> transform function.
>
> One more question:
>
> I have multiple CSV files (same structure, each directory added as a
> partition) mapped to a Hive table. Then I run several group-by jobs on the
> same data, like the ones below. Each runs as a separate job, so multiple
> mappers read the same data from disk and then compute the different
> group/aggregation results.
>
> Each job below fetches the same data from disk. Can this be avoided by
> reading each split only once and computing the different group-bys in the
> same mapper? That way the number of mappers would come down drastically,
> and the repeated disk seeks over the same data would be avoided. Do I need
> to write a custom MapReduce job to do this, or can a multi-insert
> (sketched after the queries below) cover it?
>
>
>
> 1)      INSERT INTO TABLE temptable1
>         SELECT TAG, col2, SUM(col5) AS SUM_col5, SUM(col6) AS SUM_col6,
>                SUM(col7) AS SUM_col7, ts
>         FROM raw_data_by_epoch
>         WHERE ts=${hivevar:collectiontimestamp}
>         GROUP BY TAG, col2, ts
>
>
>
> 2)      INSERT INTO TABLE temptable2
>         SELECT TAG, col2, col3, SUM(col5) AS SUM_col5, SUM(col6) AS
>                SUM_col6, SUM(col7) AS SUM_col7, ts
>         FROM raw_data_by_epoch
>         WHERE ts=${hivevar:collectiontimestamp}
>         GROUP BY TAG, col2, col3, ts
>
>
>
> 3)      INSERT INTO TABLE temptable3
>         SELECT TAG, col2, col3, col4, SUM(col5) AS SUM_col5, SUM(col6) AS
>                SUM_col6, SUM(col7) AS SUM_col7, ts
>         FROM raw_data_by_epoch
>         WHERE ts=${hivevar:collectiontimestamp}
>         GROUP BY TAG, col2, col3, col4, ts
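>
> A sketch of what I mean, not tested: Hive's multi-table insert can scan
> raw_data_by_epoch once and feed all three aggregations from the same map
> phase (tables and columns as above):
>
>     FROM raw_data_by_epoch
>     INSERT INTO TABLE temptable1
>       SELECT TAG, col2, SUM(col5) AS SUM_col5, SUM(col6) AS SUM_col6,
>              SUM(col7) AS SUM_col7, ts
>       WHERE ts=${hivevar:collectiontimestamp}
>       GROUP BY TAG, col2, ts
>     INSERT INTO TABLE temptable2
>       SELECT TAG, col2, col3, SUM(col5) AS SUM_col5, SUM(col6) AS
>              SUM_col6, SUM(col7) AS SUM_col7, ts
>       WHERE ts=${hivevar:collectiontimestamp}
>       GROUP BY TAG, col2, col3, ts
>     INSERT INTO TABLE temptable3
>       SELECT TAG, col2, col3, col4, SUM(col5) AS SUM_col5, SUM(col6) AS
>              SUM_col6, SUM(col7) AS SUM_col7, ts
>       WHERE ts=${hivevar:collectiontimestamp}
>       GROUP BY TAG, col2, col3, col4, ts;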
>
>
>
> Thanks,
>
> Chandra
>
>
>
> *From:* Stephen Sprague [mailto:spragues@gmail.com]
> *Sent:* Friday, January 17, 2014 11:39 AM
> *To:* user@hive.apache.org
> *Subject:* Re: complex datatypes filling
>
>
>
> remember you can always set up a stream job to do any wild and crazy custom
> thing you want. see the transform() function documentation.  It's really
> quite easy. honest.
>
>
>
> On Thu, Jan 16, 2014 at 9:39 PM, Bogala, Chandra Reddy <
> Chandra.Bogala@gs.com> wrote:
>
> Hi,
>
>   I found a lot of examples for mapping JSON data into Hive complex data
> types (map, array, struct, etc.). But I don't see anywhere how to fill
> complex data types from a nested SQL query (i.e., group by a few key
> columns, with an array of structs (multiple columns) holding the result
> values), so that it will be easy to map the result back into an
> embedded/nested JSON document.
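>
> Concretely, something like this sketch of what I'm after (assuming a Hive
> version whose collect_list/collect_set UDAFs can aggregate structs; table
> and column names are just placeholders):
>
>     SELECT TAG, col2,
>            collect_list(named_struct('col3', col3, 'col5', col5)) AS vals
>     FROM raw_data_by_epoch
>     GROUP BY TAG, col2;
>
> Each group would then carry an array<struct<...>> that maps directly onto
> a nested JSON document.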
>
>
>
> Thanks,
>
> Chandra
>
>
>
