hive-user mailing list archives

From LLBian <linanmengxia...@126.com>
Subject Re:Re: what is the difference between "hive.compute.splits.in.am=true" and "hive.compute.splits.in.am=false"
Date Tue, 19 Jan 2016 04:45:44 GMT


Thank you so much for your quick response. Yes, the option is used only for Hive on Tez.
I want to know where it comes from and the principle behind it.
Maybe this resource “http://www.slideshare.net/Hadoop_Summit/w-235phall1pandey/29” is very
useful, but I cannot visit it from our country (maybe for political reasons). Can you please
give me some other explanation?

Thank you & Best Regards

---LLBian

At 2016-01-19 11:44:02, "Gopal Vijayaraghavan" <gopalv@apache.org> wrote:
>
>>what is the difference between "hive.compute.splits.in.am=true" and
>>"hive.compute.splits.in.am=false"?
>>which value is better?
>
>First up, those options are specific to Tez.
>
>The old MapReduce model was to always compute splits before asking for
>resources to run. And this uses the gateway host (where the CLI runs) to
>do that.
>
>That model runs sequentially and overloads single gateway machines during
>heavy concurrency, particularly when used via ODBC (HiveServer2 mode).
>
>Here's an old slide explaining how computing splits in the Tez AM instead
>speeds up queries.
>
>http://www.slideshare.net/Hadoop_Summit/w-235phall1pandey/29
>
>
>This dynamic & pipelined model lays down the foundation for optimizations
>like Tez's dynamic partition pruning.
>
>Cheers,
>Gopal
>
>
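For reference, the option discussed above can be set per session. This is only a minimal sketch, assuming a Hive session that runs on Tez; hive.execution.engine is not mentioned in the thread and is included here as an assumption about how the session selects Tez. Which value works better will depend on the cluster and the workload.

```sql
-- Assumption: the session runs on Tez; the option is specific to Tez.
set hive.execution.engine=tez;

-- true:  splits are computed in the Tez AM (application master)
-- false: splits are computed on the client/gateway host before the job is submitted
set hive.compute.splits.in.am=true;
```

Setting the value to false falls back to the older behaviour described in the quoted reply, where the gateway host does the split computation.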