hive-dev mailing list archives

From Lianhui Wang <lianhuiwan...@gmail.com>
Subject Re: Hive insert overwrite strange behavior
Date Mon, 21 Jul 2014 05:03:37 GMT
The operator plans of the two statements are different. The first one:
TableScanOperator--SelectOperator--ReduceSinkOperator--FileSinkOperator--MoveOperator
The second one:
TableScanOperator--SelectOperator--FetchOperator
In the second one, the FetchOperator runs on the client and writes the rows
directly to local output, so no MR job is needed.
In the first one, the result is first written to a temporary HDFS directory,
and that temporary directory is then moved to the local directory.
You can put explain in front of each statement and look at its operator plan.
Example:
explain insert overwrite local directory 'output' select * from test limit 10;
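
To compare the two plans without reading them by eye, the explain output can be checked for the kind of stage it schedules. The following is only a rough sketch: the helper name classify_plan is hypothetical, the marker strings "Map Reduce" and "Fetch Operator" are assumed to be the stage labels that explain prints, and the two sample plans are abbreviated and hand-written, not real Hive output.

```python
# Hypothetical helper: classify a Hive EXPLAIN dump by the kind of work it schedules.
# Assumption: EXPLAIN labels an MR stage "Map Reduce" and a client-side
# fetch stage "Fetch Operator".

def classify_plan(explain_text: str) -> str:
    """Return 'mapreduce', 'fetch-only', or 'unknown' for an EXPLAIN dump."""
    lines = [line.strip() for line in explain_text.splitlines()]
    if any(line.startswith("Map Reduce") for line in lines):
        return "mapreduce"
    if any(line.startswith("Fetch Operator") for line in lines):
        return "fetch-only"
    return "unknown"


# Abbreviated, hand-written samples (not real Hive output).
INSERT_PLAN = """
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
  Stage: Stage-0
    Move Operator
"""

SELECT_PLAN = """
STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: 10
"""

if __name__ == "__main__":
    print(classify_plan(INSERT_PLAN))
    print(classify_plan(SELECT_PLAN))
```

To use it on a real plan, the output of hive -e "explain ..." could be saved to a file and its contents passed to classify_plan.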


2014-07-16 11:36 GMT+08:00 Azuryy Yu <azuryyyu@gmail.com>:

> Hi,
>
> I think the following two SQL commands have the same effect.
>
> 1) hive -e "insert overwrite local directory 'output' select * from test
> limit 10;"
> 2) hive -e "select * from test limit 10;" > output
>
>
> But the second one reads HDFS directly and takes only two seconds, while
> the first one submits an MR job with one reducer.
>
> Why is there such a difference? Thanks.
>



-- 
thanks

王联辉(Lianhui Wang)
blog: http://blog.csdn.net/lance_123
Interests: databases, distributed systems, data mining, programming languages, Internet technologies, etc.
