hive-user mailing list archives

From David Kulp <dk...@fiksu.com>
Subject Re: Managed vs external tables in hive
Date Thu, 10 May 2012 22:02:15 GMT
It's simpler than this.  The files look the same whether the table is managed or external,
and are often just simple delimited text.  The only differences are that the files belonging
to a managed table are deleted when the table is dropped, and that files loaded into a managed
table are moved into Hive's own warehouse directory.  External tables leave the files where
they are: dropping one removes only the table definition.  Query performance is the same
for both.
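
As a rough illustration of that difference, here is a minimal HiveQL sketch.  The table names,
column, delimiter and HDFS paths are made up for the example, and it assumes the default
warehouse location set by hive.metastore.warehouse.dir (typically /user/hive/warehouse).

```sql
-- Hypothetical names and paths throughout; adjust to your own cluster.

-- Managed table: Hive owns the files under its warehouse directory.
CREATE TABLE managed_logs (line STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- LOAD DATA INPATH moves the HDFS file into the table's directory,
-- e.g. /user/hive/warehouse/managed_logs/ with the default settings.
LOAD DATA INPATH '/tmp/incoming/part-00000' INTO TABLE managed_logs;

-- External table: Hive only records where the files already live.
CREATE EXTERNAL TABLE external_logs (line STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/logs';

-- Both tables are read the same way at query time.
SELECT COUNT(*) FROM managed_logs;
SELECT COUNT(*) FROM external_logs;

-- Dropping the managed table removes the metadata and deletes its files.
DROP TABLE managed_logs;

-- Dropping the external table removes only the metadata;
-- the files under /data/logs stay in place.
DROP TABLE external_logs;
```

Both SELECT statements read files straight from HDFS through the same SerDe path, which is
why there is no extra cost or caching layer specific to external tables.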

On May 10, 2012, at 5:52 PM, kulkarni.swarnim@gmail.com wrote:

> I am pretty new to hive and was trying to clearly understand the difference between a
> managed and an external table.
>
> As my current understanding stands, a managed table is a table whose data is completely
> owned by hive, whereas an external table is usually created to have a hive frontend for the
> data managed in external systems. I would suppose this would mean that a query on an external
> table goes out to fetch data from the given external table, deserializes it according to the
> given/suitable SerDe and then shows the output of the query in hive format.
>
> So does this mean that the cost of using external tables is much higher than the native ones?
> Or is there some caching that comes into play that I am not seeing right now?
> 
> Thanks for the help.
> 
> -- 
> Swarnim

