hive-user mailing list archives

From 周梦想 <abloz...@gmail.com>
Subject how to handle variable format data of text file?
Date Mon, 11 Mar 2013 04:04:53 GMT
I have files like this:
03/11/13 10:59:52 00000ec0 1009 180538126 92041 2300 0 0 7 21|47|20|33|11 0:2775
03/11/13 10:59:52 00000744 1010 178343610 92042 350 1 0 -1 NULL NULL 22 45
Each line consists of space-separated fields:
date time threadid gid userid [variable-format data, itself grouped into fields
separated by spaces]

I'd like to create a table like:

hive> create external table handresult (hdate string, htime string, thid string,
    > gid int, userid string, ldata string)
    > row format delimited fields terminated by " ";
OK

But the table above holds only part of each line:
select * from handresult;
03/11/13 10:59:52 00000ec0 1009 180538126 92041
03/11/13 10:59:52 00000744 1010 178343610 92042

I can't get the remaining data, such as "2300 0 0 7 21|47|20|33|11 0:2775".

ldata can vary in length and format: it may be fields separated by " " or an
array, and we will parse it differently for each gid.
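
To make the intended split concrete, here is a small Python sketch (not Hive) of
what I would like each row to become: the first five space-separated fields go
into fixed columns, and everything after them stays in ldata as one untouched
string. The function name split_record and the dictionary keys are only for
illustration; they just mirror the column names in the table above.

```python
def split_record(line):
    """Split one log line into five fixed fields plus the raw remainder."""
    # maxsplit=5 stops splitting after the fifth space, so the tail keeps
    # its inner spaces and can be parsed later according to gid.
    parts = line.rstrip("\n").split(" ", 5)
    if len(parts) < 5:
        raise ValueError("expected at least 5 space-separated fields")
    hdate, htime, thid, gid, userid = parts[:5]
    ldata = parts[5] if len(parts) > 5 else ""
    return {
        "hdate": hdate,
        "htime": htime,
        "thid": thid,
        "gid": int(gid),
        "userid": userid,
        "ldata": ldata,
    }


row = split_record(
    "03/11/13 10:59:52 00000ec0 1009 180538126 92041 2300 0 0 7 21|47|20|33|11 0:2775"
)
print(row["ldata"])  # prints: 92041 2300 0 0 7 21|47|20|33|11 0:2775
```

In other words, ldata should hold the whole rest of the line, not just the
next space-separated token.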

How can I do this?

Thanks,
Andy Zhou
