hive-user mailing list archives

From Namit Jain <>
Subject RE: Load data from file header
Date Sat, 04 Sep 2010 15:21:37 GMT
Can't you parse the file to get 2 files?
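
One way to read this suggestion is as a small preprocessing step run before the Hive load. The sketch below is only an illustration, not something posted in the thread: it assumes the header is whitespace-separated with the store number first (as in the "Store#   Date" layout quoted further down), that the input is non-empty, and the helper names split_header and tag_rows are hypothetical. It separates the header from the data lines and, as an alternative, prefixes every data row with the store number so the value survives the load.

```python
import io

def split_header(lines):
    """Separate the first (header) line from the remaining data lines."""
    it = iter(lines)
    header = next(it).rstrip("\n")            # assumes at least one line
    data = [line.rstrip("\n") for line in it if line.strip()]
    return header, data

def tag_rows(header, data, sep="\t"):
    """Prefix each data row with the store number taken from the header."""
    store = header.split()[0]                 # assumption: store# is the first field
    return [f"{store}{sep}{row}" for row in data]

# Hypothetical sample input mirroring the layout described in the thread.
sample = io.StringIO("1234   20100903\nData Line1\nData Line2\n")
header, data = split_header(sample)
tagged = tag_rows(header, data)
```

Writing the tagged rows back out would give a file in which every line carries the store number, which could then be loaded into a table with one extra column.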

From: Sunil Subrahmanyam []
Sent: Saturday, September 04, 2010 6:59 AM
Subject: RE: Load data from file header

Thanks Namit for the response.

The format of the header line is different from the other lines. Is
there a way to retain the information from the first line (the store#
information), while I parse the remaining lines of data using regex
serde? To create table T1, I can use regex to get columns "c1" and "c2",
but how do I retain the store# information from the first line?

I can get c1 and c2 as below,

create table T1 (c1 string, c2 string) ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES
("input.regex" = "(\\d{6}).{3}(\\d{6}).*") stored as textfile;
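
To make the pattern concrete, here is a rough check of what it captures, written in Python rather than Hive. The sample row is invented for illustration, parse_row is a hypothetical helper, and fullmatch is used on the assumption that the SerDe matches each row in its entirety. It also suggests why the header is lost: a line in a different format does not match at all.

```python
import re

# Same pattern as input.regex above, with the HiveQL string escaping removed.
PATTERN = re.compile(r"(\d{6}).{3}(\d{6}).*")

def parse_row(line):
    """Return (c1, c2) if the whole line matches the pattern, else None."""
    m = PATTERN.fullmatch(line)
    return m.groups() if m else None

row = parse_row("123456abc654321 trailing text")   # hypothetical data line
header_row = parse_row("Store#   Date")            # header layout from the thread
```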


-----Original Message-----
From: Namit Jain []
Sent: Saturday, September 04, 2010 12:17 AM
Subject: RE: Load data from file header

create 2 tables T1 and T2.

T1 has the schema of the file - no partitioning column (say
T2 is partitioned on (store#) - and the schema is 1 less column (c1, c2
partitioned by store#)

load the data into T1


insert into T2 partition(store#) select c1,c2,store# from T1

From: Sunil Subrahmanyam []
Sent: Friday, September 03, 2010 8:11 PM
Subject: Load data from file header


My data files have a single line (first line) of header information
followed by many lines of actual data. I am able to load the data into
hive table using RegexSerDe. But I want to save the information in the
header with every data row or use it to partition the table. How do I do this?

Filename: Data.txt
Store#   Date
Data Line1
Data Line2
How do I save the store# with each data line? Or use store# to partition the table?

