hive-user mailing list archives

From Manhee Jo ...@nttdocomo.com>
Subject Re: loading data from HDFS or local file to
Date Thu, 23 Jul 2009 02:12:03 GMT
Hi Zheng,

I tried to load a sample file after creating an external table, as shown
below.

hive> create external table extab (key int, val string)
      > row format delimited fields terminated by '\t'
      > lines terminated by '\n'
      > location '/user/hive/warehouse/test/';

Here, /user/hive/warehouse/test contains the HDFS file that I am going to
load into table extab. This was OK. On load, though,

hive> load data inpath '/user/hive/warehouse/test/kv1.txt'
      > overwrite into table extab;

I got the error below:

FAILED: Error in semantic analysis: line 2:17 Path is not legal 
'/user/hive/warehouse/test/kv1.txt':
Move from: hdfs://vm2:9000/user/hive/warehouse/test/kv1.txt to: 
/user/hive/warehouse/test/ is not valid.
Please check that values for params "default.fs.name" and 
"hive.metastore.warehouse.dir" do not conflict.

I've tried changing the directories to different ones, but to no avail. Can
you suggest any solutions?

By the way, is "default.fs.name" right? I could find "fs.default.name" but 
not "default.fs.name".
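
One way to check the two settings named in the error from the Hive CLI, and
to see whether the external table already reads the file without any load.
This is only a sketch: it assumes the "set <name>;" form prints the current
value in this Hive version, and it relies on an external table reading
whatever files sit under its location.

```sql
-- Print the current values of the two settings the error refers to.
-- fs.default.name is the Hadoop property name; the error text seems to
-- have the words transposed.
set fs.default.name;
set hive.metastore.warehouse.dir;

-- kv1.txt is already under the table's location, so it may be queryable
-- as-is, with no LOAD DATA step.
select * from extab limit 10;
```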

Thank you,
Manhee


----- Original Message ----- 
From: "Zheng Shao" <zshao9@gmail.com>
To: <hive-user@hadoop.apache.org>
Sent: Thursday, July 23, 2009 5:49 AM
Subject: Re: loading data from HDFS or local file to


If the huge file is already on HDFS (load data WITHOUT local), Hive
will just *move* the file into the table (NOTE: that means user won't
be able to see the file in its original directory afterwards)

If you don't want that to happen, you might want to use "CREATE
EXTERNAL TABLE .... LOCATION "/user/myname/myfiledir";"

If the huge file is on local file system, you will have to use (load
data WITH local), and Hive will copy the file.
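
As a rough illustration of the cases above (the table names mytable and
myext and the input paths are hypothetical, not taken from this thread):

```sql
-- File already on HDFS: Hive moves it into the table's directory,
-- so it is no longer visible under /user/myname/input afterwards.
LOAD DATA INPATH '/user/myname/input/kv1.txt' INTO TABLE mytable;

-- File on the local file system: Hive copies it, and the local file stays.
LOAD DATA LOCAL INPATH '/tmp/kv1.txt' INTO TABLE mytable;

-- To leave HDFS files where they are, point an external table at
-- the directory that holds them instead of loading.
CREATE EXTERNAL TABLE myext (key INT, val STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/myname/myfiledir';
```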


Zheng

On Wed, Jul 22, 2009 at 12:25 AM, Manhee Jo<jo@nttdocomo.com> wrote:
> Hi all,
>
> What really happens when a huge file (e.g. some tens of TB) is loaded with
> "LOAD DATA (LOCAL) INPATH ... INTO TABLE"? Does hive need to scan the
> entire file before processing anything, even something very simple
> (e.g. select)?
> If so, are there any solutions to decrease the number of disk accesses? Is
> partitioning a way to do it?
>
> Many Thanks,
> Manhee
>



-- 
Yours,
Zheng


