hadoop-user mailing list archives

From Manish <manishbh...@rocketmail.com>
Subject Re: Map issue in Hive.
Date Fri, 21 Sep 2012 15:30:57 GMT
Hey Bejoy! Thanks a ton.

Things are so easy in Hive :) 

This is what my query looks like after defining the column as a Map (associative array):

select pv.c_14["+UserType"] from page_view_tmp_2 pv where
pv.c_14["+LastLogin"] IS NOT NULL
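
For anyone finding this thread in the archive, here is a minimal sketch of a table definition that matches the query above. The delimiter clauses are the ones Bejoy suggested; the shortened column list and the map column name (c_14, taken from the query) are assumptions for illustration, not my full DDL.

```sql
-- Sketch only: column list shortened; c_14 is assumed to be the cookie column.
CREATE EXTERNAL TABLE page_view_tmp_2 (
  c_0  STRING,
  c_14 MAP<STRING,STRING>   -- e.g. ASP.NET_SessionId=...;+UserType=G;...
)
COMMENT 'Page View'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '            -- separates the columns of a log line
  COLLECTION ITEMS TERMINATED BY ';'  -- separates one key=value pair from the next
  MAP KEYS TERMINATED BY '='          -- separates a key from its value
STORED AS TEXTFILE;
```

With these clauses each ';'-separated entry becomes its own map entry. The keys keep their leading '+', which is why the query looks up "+UserType" rather than "UserType".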

Thanks Again,
Manish.

On Fri, 2012-09-21 at 00:45 -0700, Bejoy KS wrote:
> Hey Manish
> 
> 
> Sorry if my post was not clear. You need to use either Array or Map
> for that column, depending on the data it holds. Looking at your
> sample data:
> 
> 
> ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM
> 
> 
> 
> I assume it needs to be split like this: each entry has the form
> key '=' value, and the key-value pairs are separated by ';'.
> 
> 
> 
> ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;
> +Rviewd=;
> +UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;
> +UserType=G;
> +LastLogin=9/11/2012+12:00:01+AM
> 
> 
> 
> So you can use Map as the column data type, and the DDL should include:
> 
> 
> COLLECTION ITEMS TERMINATED BY ';'
> MAP KEYS TERMINATED BY '='  
> 
> 
> 
>   Hope it is clear now :)
> 
> 
> Regards,
> Bejoy KS
> 
> 
> 
> ______________________________________________________________________
> From: Manish.Bhoge <Manish.Bhoge@target.com>
> To: "user@hive.apache.org" <user@hive.apache.org>; 'Bejoy KS'
> <bejoy_ks@yahoo.com>; user <user@hadoop.apache.org> 
> Sent: Friday, September 21, 2012 1:01 PM
> Subject: RE: Map issue in Hive.
> 
> 
> 
> Thanks Bejoy. So you mean that in the scenario below we need both
> collection and map together? Do I need to define Array and MAP
> together for the same column? As I understand from your mail, this
> column holds not just a MAP but a collection of Maps. Is this
> assumption right?
> 
> Thank You,
> Manish.
> 
> -----Original Message-----
> From: Bejoy KS [mailto:bejoy_ks@yahoo.com] 
> Sent: Friday, September 21, 2012 10:50 AM
> To: user@hive.apache.org; user
> Subject: Re: Map issue in Hive.
> 
> 
> 
> Hi Manish
> 
> A couple of things to keep in mind here.
> 
> If you have column data like this
> "key1:value1;key2:value2;key3:value3;" and this column has to be
> handled by a map data type, then the DDL should look like
> FIELDS TERMINATED BY '<any char>' 
> COLLECTION ITEMS TERMINATED BY ';'
> MAP KEYS TERMINATED BY ':' 
> 
> That is, the separator between one key-value pair and the next is
> specified using 'COLLECTION ITEMS TERMINATED BY', and the separator
> between the key and the value within each pair is specified using
> 'MAP KEYS TERMINATED BY'.
> 
> If your column is just a collection of elements rather than key-value
> pairs, you can use an Array data type instead. In that case, just
> specify the delimiter between values using 'COLLECTION ITEMS
> TERMINATED BY'.
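> 
> As a minimal sketch of the array case (the table name, column names
> and sample row below are hypothetical, chosen only for illustration):
> 
> ```sql
> -- Hypothetical table: one tab-separated row such as  u1<TAB>red;green;blue
> CREATE TABLE tags_demo (
>   id   STRING,
>   tags ARRAY<STRING>
> )
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY '\t'
>   COLLECTION ITEMS TERMINATED BY ';'
> STORED AS TEXTFILE;
> 
> -- Array elements are addressed by zero-based index.
> SELECT tags[0] FROM tags_demo;
> ```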
> 
> 
> 
> Regards,
> Bejoy KS
> 
> 
> ________________________________
> From: Manish <manishbhoge@rocketmail.com>
> To: user <user@hadoop.apache.org> 
> Cc: user <user@hive.apache.org> 
> Sent: Friday, September 21, 2012 10:04 AM
> Subject: Map issue in Hive.
> 
> 
> Hivers,
> 
> I have a web log that I need to load into a single table. One column
> holds a complete string of important data, and I want to extract all
> of the information from that column for further analysis.
> 
> The issue is that after giving ';' as the delimiter, I expected a map
> entry for every occurrence of ';'. Instead, Hive splits only on the
> first delimiter (;), and the rest of the string ends up in the value.
> 
> This is what the data in that one column looks like:
> 
> ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM
> 
>     It is getting stored as below. 
> 
> {"ASP.NET_SessionId":"bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM"}
> 
> Below is the DDL. 
> 
> CREATE external TABLE page_view_tmp_2
> (
> C_0 STRING,
> C_1 MAP<STRING,STRING>,
> C_2 STRING,
> C_3 STRING,
> C_41 STRING)
> COMMENT 'Page View'
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' MAP KEYS TERMINATED BY ';'
> STORED AS TEXTFILE
> 
> 
> 

