hive-user mailing list archives

From Thiruvel Thirumoolan <thiru...@yahoo-inc.com>
Subject Re: Loading files into tables
Date Tue, 01 Feb 2011 20:47:50 GMT


Local tables are like hive tables in all other senses except that they are on the local disk
rather than HDFS. The only other difference I know of is that when you call "drop table" on
a local table, only the metadata on the table gets deleted. For tables on HDFS, the table
data gets deleted with the metadata.


Ajo,

I think there is some confusion here. AFAIK, Hive has no concept of local tables. The behavior
you mention is that of EXTERNAL tables, and the data for external tables can be on the local
file system or on HDFS, depending on configuration. All other tables are called MANAGED tables,
for which Hive creates a directory under the warehouse dir.
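
A minimal HiveQL sketch of the difference, assuming made-up table names, columns, and an
illustrative path (none of these come from this thread):

```sql
-- Managed table: Hive creates a directory for it under the warehouse dir.
CREATE TABLE page_views_managed (user_id STRING, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- External table: Hive only records the location it is given.
-- '/data/page_views' is a hypothetical path.
CREATE EXTERNAL TABLE page_views_ext (user_id STRING, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- Dropping the managed table deletes the metadata and the table data.
DROP TABLE page_views_managed;

-- Dropping the external table deletes only the metadata; the files stay in place.
DROP TABLE page_views_ext;
```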

-Ajo.

On Tue, Feb 1, 2011 at 8:41 AM, Amlan Mandal <amlan@fourint.com> wrote:
Thanks Ajo.
Please confirm if my understanding is correct.
That means when I do "LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename"
the data is in the local file system. If I need to run Hive queries (which in turn would be
converted to MapReduce jobs), I need to pull the data into some other table whose data is in
HDFS by means of

INSERT OVERWRITE TABLE tablename_new SELECT *  FROM tablename ... (kind of)

So those LOCAL tables are kind of temporary.

See - http://wiki.apache.org/hadoop/Hive/LanguageManual/DML That should clarify load local.


Amlan


On Tue, Feb 1, 2011 at 6:51 PM, Ajo Fod <ajo.fod@gmail.com> wrote:
>
> Look up for local :
> http://wiki.apache.org/hadoop/Hive/GettingStarted
>
> -Ajo.
>
> On Tue, Feb 1, 2011 at 3:15 AM, Amlan Mandal <amlan@fourint.com> wrote:
>>
>> LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
>>
>> When I use LOCAL keyword does hive create a hdfs file for it?
>>

Yes. Hive creates a file for it on HDFS.

As Ping Zhu mentioned, do a 'describe formatted <tablename>' or 'describe extended <tablename>'
after loading data.  Check that location on HDFS.
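
For example, a rough sketch of that check, assuming a hypothetical table page_views and a
hypothetical local file /tmp/page_views.tsv:

```sql
-- Copy a file from the local file system into the table's storage location.
LOAD DATA LOCAL INPATH '/tmp/page_views.tsv' OVERWRITE INTO TABLE page_views;

-- Show the table metadata; the Location field gives the directory to inspect.
DESCRIBE FORMATTED page_views;
```

From the Hive CLI, running "dfs -ls" on the reported location should then list the loaded file.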

You can also check the logs (they are usually at /tmp/<username>/hive.log). You can
see the local file being copied to the HDFS scratch directory and then moved to a directory
under the warehouse. If you find anything strange, could you please post it here?
