hive-user mailing list archives

From Paul Yang <py...@facebook.com>
Subject RE: Managed Vs External tables
Date Wed, 21 Jul 2010 19:05:29 GMT
Yeah, the metastore still holds the definition for external tables. As you mentioned, for an
external table, Hive doesn't delete the data when you drop the table, nor does it rename the
directory when the table is renamed. Also, external tables can't be archived. In general, Hive
will not do any operations that affect the underlying files if the table is declared external.
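
To make the drop and rename behavior described above concrete, here is a minimal toy model in Python. It is only a sketch and not Hive's implementation: the ToyMetastore class, its method names, and the temporary directories are all hypothetical stand-ins for the metastore and the table directories. Archiving is not modeled.

```python
import shutil
import tempfile
from pathlib import Path


class ToyMetastore:
    """Hypothetical toy model of the behavior described above; not Hive code."""

    def __init__(self):
        # Definitions for BOTH managed and external tables live here.
        self.tables = {}  # name -> {"location": Path, "external": bool}

    def create_table(self, name, location, external=False):
        location = Path(location)
        location.mkdir(parents=True, exist_ok=True)
        self.tables[name] = {"location": location, "external": external}

    def drop_table(self, name):
        # The definition is always removed; files only for managed tables.
        meta = self.tables.pop(name)
        if not meta["external"]:
            shutil.rmtree(meta["location"])

    def rename_table(self, old, new):
        # The name always changes; the directory moves only for managed tables.
        meta = self.tables.pop(old)
        if not meta["external"]:
            target = meta["location"].with_name(new)
            meta["location"].rename(target)
            meta["location"] = target
        self.tables[new] = meta


root = Path(tempfile.mkdtemp())
ms = ToyMetastore()
ms.create_table("managed_t", root / "managed_t")
ms.create_table("external_t", root / "external_t", external=True)
(root / "managed_t" / "part-0").write_text("a\n")
(root / "external_t" / "part-0").write_text("b\n")

# Renaming the external table leaves its directory where it was.
ms.rename_table("external_t", "renamed_t")
external_dir_kept = (root / "external_t").exists()

# Dropping both tables removes both definitions, but only the managed data.
ms.drop_table("managed_t")
ms.drop_table("renamed_t")
managed_data_gone = not (root / "managed_t").exists()
external_data_kept = (root / "external_t" / "part-0").exists()
```

In this model the only thing the external flag changes is whether the files are touched; the table definition is tracked the same way in both cases.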

From: Pradeep Kamath [mailto:pradeepk@yahoo-inc.com]
Sent: Wednesday, July 21, 2010 10:10 AM
To: hive-user@hadoop.apache.org
Subject: Managed Vs External tables

Hi,
  I am trying to understand the differences between managed and external tables. From http://wiki.apache.org/hadoop/Hive/StorageHandlers#Terminology:
"A managed table is one for which the definition is primarily managed in Hive's metastore,
and for whose data storage Hive is responsible. An external table is one whose definition
is managed in some external catalog, and whose data Hive does not own (i.e. it will not be
deleted when the table is dropped)."

I am a little confused by the statement that an "external table is one whose definition is
managed in some external catalog". I thought the definition for external tables is still
managed by the metastore (and not an external catalog), no?

I thought the only difference between managed and external tables is that the data is not
dropped when you drop an external table - are there any other differences?

Thanks,
Pradeep
