hive-user mailing list archives

From Raj Hadoop <hadoop...@yahoo.com>
Subject Re: Loading a flat file + one additional field to a Hive table
Date Fri, 05 Jul 2013 22:27:26 GMT
Thanks Sanjay. I will look into this.

Also - one more question.

When I load a log file into Hive and compare the counts like this

select count(*) from <<Table>>

versus

wc -l <<File>>

I see a few hundred more records in <<Table>> than in <<File>>. How should I debug this? Any
tips, please.
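
One possible cause of such a mismatch: `wc -l` counts only newline (LF) characters, whereas Hadoop's default text line reader, to my understanding, also treats a bare carriage return (CR) as a record terminator, so a CR embedded in a log line would produce an extra row. A final line without a trailing newline is also missed by `wc -l`. Other common causes are loading the same file twice (LOAD DATA ... INTO TABLE appends) or extra files in the table's directory. A minimal sketch that checks a local file for the first two causes; the function names are hypothetical:

```shell
#!/bin/sh
# Count CR characters that are not part of a CRLF pair.
# Each bare CR would be an extra record boundary for a reader
# that accepts CR, LF, or CRLF as line terminators.
count_bare_cr() {
    awk '{ sub(/\r$/, ""); n += gsub(/\r/, "") } END { print n + 0 }' "$1"
}

# Estimate the record count such a reader would see:
# LF-terminated lines + bare CRs + 1 if the last line lacks a newline.
estimate_records() {
    lines=$(wc -l < "$1")
    bare=$(count_bare_cr "$1")
    extra=0
    # tail -c 1 yields an empty string when the last byte is a newline
    if [ -n "$(tail -c 1 "$1")" ]; then extra=1; fi
    echo $(( lines + bare + extra ))
}
```

If `estimate_records <<File>>` is close to the Hive count while `wc -l <<File>>` is not, stray carriage returns are a likely suspect; otherwise check for duplicate loads or extra files in the table location.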


________________________________
From: Sanjay Subramanian <Sanjay.Subramanian@wizecommerce.com>
To: "user@hive.apache.org" <user@hive.apache.org>; Raj Hadoop <hadoopraj@yahoo.com>
Sent: Saturday, July 6, 2013 4:32 AM
Subject: Re: Loading a flat file + one additional field to a Hive table
 


How about this?

Assume you have a log file called
oompaloompa.log

TIMESTAMP=$(date +%Y_%m_%d_T%H_%M_%S)
mv oompaloompa.log oompaloompa.log.${TIMESTAMP}
cat oompaloompa.log.${TIMESTAMP} | hdfs dfs -put - /user/sasubramanian/oompaloompa.log.${TIMESTAMP}

This puts the file directly on HDFS, and you can put it in the LOCATION specified by your
Hive table definition.

sanjay
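
The commands above put the timestamp in the file name. If the date instead has to become the first column (D1) of every record, one option is to prefix each line while streaming the file to HDFS. A minimal sketch, assuming a comma-delimited table; `prepend_stamp`, the delimiter, and the paths in the usage note are hypothetical:

```shell
#!/bin/sh
# Prefix every line read from stdin with a stamp and a field delimiter.
#   $1 = stamp, e.g. today's date
#   $2 = field delimiter used by the Hive table (assumed to be ',')
prepend_stamp() {
    awk -v stamp="$1" -v delim="$2" '{ print stamp delim $0 }'
}
```

Usage might look like `prepend_stamp "$(date +%Y-%m-%d)" ',' < weblog.log | hdfs dfs -put - /path/to/table/location/weblog.log`, where the target directory is the LOCATION of the Hive table.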
 
From: "manishbhoge@rocketmail.com" <manishbhoge@rocketmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Friday, July 5, 2013 10:39 AM
To: Raj Hadoop <hadoopraj@yahoo.com>, Hive <user@hive.apache.org>
Subject: Re: Loading a flat file + one additional field to a Hive table


Raj,

You should load the data into a temp table first and then move it into the final table with
a select query:
Select date(), c1, c2, ... from temp_table;
Reason: we should avoid custom operations in the load step unless they are necessary.
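
The staging approach described above could look roughly like the following. The sketch only prints the HiveQL so it can be reviewed before being run. The staging table name `T1_staging`, the STRING column types, the comma delimiter, and the use of `to_date(from_unixtime(unix_timestamp()))` to produce today's date are all assumptions, not something stated in this thread:

```shell
#!/bin/sh
# Print HiveQL that loads the raw file into a staging table and then
# copies it into T1 with today's date as the first column.
staging_hql() {
    cat <<'EOF'
-- Staging table with only the columns present in the file (types assumed)
CREATE TABLE IF NOT EXISTS T1_staging (C1 STRING, C2 STRING, C3 STRING, C4 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Plain load with no transformation
LOAD DATA LOCAL INPATH '/software/home/hadoop/dat_files/' OVERWRITE INTO TABLE T1_staging;

-- Add the date stamp while moving rows into the final table
INSERT INTO TABLE T1
SELECT to_date(from_unixtime(unix_timestamp())), C1, C2, C3, C4
FROM T1_staging;
EOF
}
```

One way to run it is `staging_hql > load_t1.hql` followed by `hive -f load_t1.hql`. Keeping LOAD DATA free of transformations and doing the date logic in the SELECT matches the reasoning given above.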



----- Reply message -----
From: "Raj Hadoop" <hadoopraj@yahoo.com>
To: "Hive" <user@hive.apache.org>
Subject: Loading a flat file + one additional field to a Hive table
Date: Fri, Jul 5, 2013 10:30 PM


Hi,
 
Can anyone please suggest the best way to do the following in Hive?

Load 'today's date stamp' + << ALL FIELDS C1,C2,C3,C4 IN A FILE F1 >> into a Hive
table T1 (D1,C1,C2,C3,C4)

    Can the following command be modified in some way to achieve the above?
    hive > load data local inpath '/software/home/hadoop/dat_files/' into table T1;

My requirement is to add a date stamp to each record of a web log file and then load it into a Hive table.
 
Thanks,
Raj 
