hive-user mailing list archives

From Bejoy KS <bejoy...@yahoo.com>
Subject Re: Map issue in Hive.
Date Fri, 21 Sep 2012 07:45:36 GMT
Hey Manish

Sorry if my post was not clear. You need to use either an Array or a Map for that column,
depending on the data it holds. Looking at your sample data:

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM


I assume it needs to be split like this: each pair has the format key '=' value, and the
key-value pairs are separated by ';'.


ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;
+Rviewd=;
+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;
+UserType=G;
+LastLogin=9/11/2012+12:00:01+AM


So you can use Map as the column data type, and the DDL should include:


COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY '='  
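
As a rough sketch only, the two clauses above could slot into your existing DDL as shown here. The table and column names and the ' ' field delimiter are copied from your mail. The lookup key '+UserType' is my assumption: with ';' as the only item delimiter, the '+' that follows each ';' in the sample would stay part of the key.

```sql
-- Sketch: the original table with both map delimiters specified.
CREATE EXTERNAL TABLE page_view_tmp_2
(
C_0 STRING,
C_1 MAP<STRING,STRING>,
C_2 STRING,
C_3 STRING,
C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '            -- separates the columns of a row
  COLLECTION ITEMS TERMINATED BY ';'  -- separates one key=value pair from the next
  MAP KEYS TERMINATED BY '='          -- separates a key from its value
STORED AS TEXTFILE;

-- Look up individual entries of the map column.
-- '+UserType' assumes the leading '+' is kept as part of the key.
SELECT C_1['ASP.NET_SessionId'], C_1['+UserType']
FROM page_view_tmp_2;
```

If the leading '+' in the keys is a nuisance, it would have to be cleaned up before or after loading; the delimiters alone do not remove it.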


  Hope it is clear now :)

Regards,
Bejoy KS


________________________________
 From: Manish.Bhoge <Manish.Bhoge@target.com>
To: "user@hive.apache.org" <user@hive.apache.org>; 'Bejoy KS' <bejoy_ks@yahoo.com>;
user <user@hadoop.apache.org> 
Sent: Friday, September 21, 2012 1:01 PM
Subject: RE: Map issue in Hive.
 
Thanks Bejoy. So you mean that in the scenario below we need both a collection and a map
together? Do I need to define Array and Map together for the same column? As I understand
from your mail, this column holds not just a Map but a collection of Maps. Is this assumption
right?

Thank You,
Manish.

-----Original Message-----
From: Bejoy KS [mailto:bejoy_ks@yahoo.com] 
Sent: Friday, September 21, 2012 10:50 AM
To: user@hive.apache.org; user
Subject: Re: Map issue in Hive.



Hi Manish

A couple of things to keep in mind here.

If you have column data like this "key1:value1;key2:value2;key3:value3;" and this column
has to be handled by a map data type, then the DDL should look like
FIELDS TERMINATED BY '<any char>'
COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY ':'

i.e. when you have key-value pairs, the separator between one pair and the next is specified
using 'COLLECTION ITEMS TERMINATED BY', and the separator between the key and the value within
each pair is specified using 'MAP KEYS TERMINATED BY'.
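
A minimal sketch of that layout, for illustration only. The table name kv_demo, the column names, and the tab field delimiter are hypothetical; the ';' and ':' delimiters follow the "key1:value1;key2:value2;key3:value3;" sample string.

```sql
-- Hypothetical table: one plain column plus one map column.
CREATE TABLE kv_demo
(
id    STRING,
attrs MAP<STRING,STRING>)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'           -- assumed column separator
  COLLECTION ITEMS TERMINATED BY ';'  -- between key:value pairs
  MAP KEYS TERMINATED BY ':'          -- between a key and its value
STORED AS TEXTFILE;

-- Fetch the value stored under one key.
SELECT attrs['key2'] FROM kv_demo;
```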

If your column is just a collection of elements rather than key-value pairs, you can
use an Array data type instead. In that case, just specify the delimiter between the values
using 'COLLECTION ITEMS TERMINATED BY'.
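
The Array case could be sketched the same way. Again, array_demo, its column names, and the tab field delimiter are hypothetical names chosen for illustration.

```sql
-- Hypothetical table: a column holding ';'-separated values as an array.
CREATE TABLE array_demo
(
id    STRING,
items ARRAY<STRING>)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'           -- assumed column separator
  COLLECTION ITEMS TERMINATED BY ';'  -- between array elements
STORED AS TEXTFILE;

-- Array elements are addressed by zero-based index; size() gives the element count.
SELECT items[0], size(items) FROM array_demo;
```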



Regards,
Bejoy KS


________________________________
From: Manish <manishbhoge@rocketmail.com>
To: user <user@hadoop.apache.org> 
Cc: user <user@hive.apache.org> 
Sent: Friday, September 21, 2012 10:04 AM
Subject: Map issue in Hive.


Hivers,

I have a web log which I need to load into a single table. One column holds a complete string
of important data, and I want to extract all the information from that one column for further
analysis.

The issue is that after giving ';' as a delimiter I was expecting a Map entry for every
occurrence of ';'. But only the first delimiter (;) is being considered, and the rest of the
string ends up in the value.

This is what the data in that column looks like:

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM

It is getting stored as below:

{"ASP.NET_SessionId":"bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM"}

Below is the DDL. 

CREATE external TABLE page_view_tmp_2
(
C_0 STRING,
C_1 MAP<STRING,STRING>,
C_2 STRING,
C_3 STRING,
C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' MAP KEYS TERMINATED BY ';' 
STORED AS TEXTFILE           