hive-user mailing list archives

From Artem Ervits <are9...@nyp.org>
Subject RE: load data stored as sequencefiles
Date Tue, 24 Sep 2013 16:59:15 GMT
I realize that I am loading a part file. I'm aware that loading directly with Sqoop works,
but we originally decided to import with Sqoop and leave the data in HDFS, i.e. without the
--hive-table flag. So my real question is: since we decided to load into HDFS as SequenceFiles
first, is there a way to take those SequenceFiles and load them into a Hive table?

Thanks.

From: Nitin Pawar [mailto:nitinpawar432@gmail.com]
Sent: Tuesday, September 24, 2013 12:00 PM
To: user@hive.apache.org
Subject: Re: load data stored as sequencefiles

If you look at your load command,

LOAD DATA INPATH '/TEST/SeqFiles/201308300700/part-m-00001' INTO TABLE tblname;
you are loading a part file, which does not look correct.

Secondly, why can't you just import using Sqoop? Why do you need LOAD DATA at all?
If you are importing to HDFS using Sqoop and then loading the data into a Hive table, you
may want to give the complete file name instead of a part file in the load command.

On Tue, Sep 24, 2013 at 7:51 PM, Artem Ervits <are9004@nyp.org> wrote:
Anyone?

From: Artem Ervits [mailto:are9004@nyp.org]
Sent: Friday, September 20, 2013 11:18 AM
To: user@hive.apache.org
Subject: load data stored as sequencefiles

Hello all,

I'm a bit lost with using Hive and SequenceFiles. I loaded data from an RDBMS using Sqoop and
stored it as SequenceFiles. I jarred the class generated by Sqoop and added it to my create
table script. I then create a table in Hive and specify "STORED AS SEQUENCEFILE"; I also
"ADD JAR SQOOP_GENERATED.JAR". Then I try to insert data with the same generated jar added.
I also specify:

SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;

LOAD DATA INPATH '/TEST/SeqFiles/201308300700/part-m-00001' INTO TABLE tblname;

When the query executes, I see this:
"[num_partitions: 0, num_files: 2, num_rows: 0, total_size: 478662618, raw_data_size: 0]"

When I select from the table, I get:
org.apache.hadoop.hive.serde2.SerDeException: class org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe:
expects either BytesWritable or Text object!

So my question is, how do I specify my generated class along with SequenceFileInputFormat
in my create statement? How do I specify the inputformats?
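
The SerDe error above says that LazySimpleSerDe only accepts BytesWritable or Text values, so one way to narrow this down is to check which value class the SequenceFile itself declares. The following is a minimal Python sketch, not part of any Hive or Sqoop tooling: it reads only the file header and assumes a SequenceFile of version 4 or later, where class names are stored as a Hadoop VInt length followed by UTF-8 bytes. The function names and the sample class names in the demo (including com.example.MyRecord) are hypothetical.

```python
import io
import struct


def read_vint(stream):
    """Decode one Hadoop variable-length integer (same scheme as WritableUtils)."""
    first = struct.unpack("b", stream.read(1))[0]  # signed first byte
    if first >= -112:
        return first  # small values are stored in a single byte
    negative = first < -120
    # The first byte encodes how many payload bytes follow.
    n_bytes = (-119 - first if negative else -111 - first) - 1
    value = int.from_bytes(stream.read(n_bytes), "big")
    return ~value if negative else value


def read_text(stream):
    """Read a length-prefixed UTF-8 string (VInt length, then the bytes)."""
    length = read_vint(stream)
    return stream.read(length).decode("utf-8")


def sequencefile_classes(stream):
    """Return (version, key_class, value_class) from a SequenceFile header."""
    magic = stream.read(4)
    if len(magic) < 4 or magic[:3] != b"SEQ":
        raise ValueError("not a SequenceFile: missing 'SEQ' magic bytes")
    version = magic[3]
    if version < 4:
        # Assumption: older versions encode class names differently.
        raise ValueError("unsupported SequenceFile version %d" % version)
    key_class = read_text(stream)
    value_class = read_text(stream)
    return version, key_class, value_class


def demo_header(key_class, value_class, version=6):
    """Build a header prefix in memory (names shorter than 128 bytes only)."""
    k, v = key_class.encode("utf-8"), value_class.encode("utf-8")
    return b"SEQ" + bytes([version, len(k)]) + k + bytes([len(v)]) + v


# Demo with a hypothetical header; a real file would be opened with open(path, "rb").
sample = demo_header("org.apache.hadoop.io.LongWritable", "com.example.MyRecord")
print(sequencefile_classes(io.BytesIO(sample)))
```

If the value class reported for a part file copied out of HDFS is the Sqoop-generated record class rather than org.apache.hadoop.io.Text or org.apache.hadoop.io.BytesWritable, that would be consistent with the exception above.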

This electronic message is intended to be for the use only of the named recipient, and may
contain information that is confidential or privileged. If you are not the intended recipient,
you are hereby notified that any disclosure, copying, distribution or use of the contents
of this message is strictly prohibited. If you have received this message in error or are
not the named recipient, please notify us immediately by contacting the sender at the electronic
mail address noted above, and delete and destroy all copies of this message. Thank you.
________________________________

Confidential Information subject to NYP's (and its affiliates') information management and
security policies (http://infonet.nyp.org/QA/HospitalManual).




--
Nitin Pawar
