hadoop-user mailing list archives

From Bejoy KS <bejoy...@yahoo.com>
Subject Re: Map issue in Hive.
Date Fri, 21 Sep 2012 05:19:34 GMT


Hi Manish

A couple of things to keep in mind here.

If you have column data like "key1:value1;key2:value2;key3:value3;" and this column
has to be handled by a map data type, then the DDL should look like
FIELDS TERMINATED BY '<any char>'
COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY ':'

That is, for a map column the separator between key-value pairs is specified using
'COLLECTION ITEMS TERMINATED BY', and the separator between the key and the value within
each pair is specified using 'MAP KEYS TERMINATED BY'.
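
For the sample value in your mail, a minimal sketch of the DDL could look like the one below. It keeps your table and column names and assumes '=' is the separator between key and value (your sample has entries such as UserType=G); I have not run it against your data. As far as I know each delimiter here is a single character, so because your pairs are separated by ';+' I would expect every key after the first to keep its leading '+'.

```sql
-- Sketch only: assumes '=' separates key from value and ';' separates pairs.
CREATE EXTERNAL TABLE page_view_tmp_2
(
C_0 STRING,
C_1 MAP<STRING,STRING>,
C_2 STRING,
C_3 STRING,
C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY '='
STORED AS TEXTFILE;

-- Look up individual entries by key; note the assumed leading '+' on later keys.
SELECT C_1['ASP.NET_SessionId'], C_1['+UserType']
FROM page_view_tmp_2;
```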

If your column is just a collection of elements rather than key-value pairs, you can
use an Array data type instead. In that case just specify the delimiter between the values
using 'COLLECTION ITEMS TERMINATED BY'.
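
As a rough illustration of the array case, here is a sketch for a column holding values like "a;b;c". The table name tags_tmp, its columns, and the tab field delimiter are made up for this example and are not from your log.

```sql
-- Hypothetical table: the second column holds ';'-separated values such as "a;b;c".
CREATE TABLE tags_tmp
(
id STRING,
tags ARRAY<STRING>)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ';'
STORED AS TEXTFILE;

-- Array elements are addressed by zero-based index; size() gives the element count.
SELECT tags[0], size(tags)
FROM tags_tmp;
```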



Regards,
Bejoy KS


________________________________
From: Manish <manishbhoge@rocketmail.com>
To: user <user@hadoop.apache.org> 
Cc: user <user@hive.apache.org> 
Sent: Friday, September 21, 2012 10:04 AM
Subject: Map issue in Hive.


Hivers,

I have a web log which I need to load into a single table. One column holds a complete string
of important data, and I want to extract all the information from that column for further
analysis.

The issue is that after giving ';' as a delimiter I was expecting a Map entry for every occurrence
of ';'. But it is considering only the first delimiter (;), and the rest of the string is coming in
as the value of that pair.

This is what the data in that column looks like:

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM

It is getting stored as below:

{"ASP.NET_SessionId":"bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM"}

Below is the DDL. 

CREATE external TABLE page_view_tmp_2
(
C_0 STRING,
C_1 MAP<STRING,STRING>,
C_2 STRING,
C_3 STRING,
C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' MAP KEYS TERMINATED BY ';' 
STORED AS TEXTFILE           
