hive-user mailing list archives

From Loren Siebert <lo...@siebert.org>
Subject Re: Load gzip files into hive
Date Thu, 28 Apr 2011 04:52:42 GMT
Your table is declared as a sequence file, but you are trying to load a gzip file. Won’t that
only work if the table is defined as a text file?

Hive isn’t converting anything on your behalf when you do LOAD DATA. It’s syntactic sugar for
copying a file into an HDFS location. From there, if you want an RCFile table or a sequence
file table or whatever, you can select from the raw_logs table into a new table (e.g., raw_logs_rcfile)
that you have defined in the different format.
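
Roughly, I mean something like the sketch below. It is untested, and several names are my own assumptions: raw_logs_text and raw_logs_seq are hypothetical table names, and the single "line" column is a placeholder because the real column list is elided in your create statement.

```sql
-- Staging table stored as plain text. Hive can read gzip-compressed
-- text files in a TEXTFILE table without unpacking them first.
-- (line STRING) is a placeholder for the real, elided column list.
CREATE TABLE raw_logs_text (line STRING)
PARTITIONED BY (dt INT)
STORED AS TEXTFILE;

-- LOAD DATA only copies the .gz file into the partition directory.
LOAD DATA LOCAL INPATH 'tmp_b.20110426.gz'
INTO TABLE raw_logs_text PARTITION (dt=20110426);

-- Target table in the format you actually want (hypothetical name).
CREATE TABLE raw_logs_seq (line STRING)
PARTITIONED BY (dt INT)
STORED AS SEQUENCEFILE;

-- Ask Hive to compress what it writes, to keep the disk savings.
SET hive.exec.compress.output=true;

-- This select is what actually rewrites the rows into the new format.
INSERT OVERWRITE TABLE raw_logs_seq PARTITION (dt=20110426)
SELECT line FROM raw_logs_text WHERE dt=20110426;
```

If disk space is the only goal, the text table alone may be enough, since the gzip file stays compressed in HDFS. Keep in mind that a gzip file is not splittable, so each file is read by a single mapper.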


On Apr 27, 2011, at 9:33 PM, wd wrote:

> hi,
> 
> I've tried to load gzip files into hive to save disk space, but failed.
> 
> hive> load data local inpath 'tmp_b.20110426.gz' into table raw_logs partition ( dt=20110426 );
> Copying data from file:/home/wd/t/tmp_b.20110426.gz
> Copying file: file:/home/wd/t/tmp_b.20110426.gz
> Loading data to table default.raw_logs partition (dt=20110426)
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> 
> The raw_logs table is created by:
> create table raw_logs ( ............)  partitioned by ( dt int ) STORED AS SEQUENCEFILE;
> 
> Is there something wrong? The error is the same in both Hive 0.5 and 0.7.

