hive-dev mailing list archives

From Thejas Nair <>
Subject [DISCUSS] Re: deprecating MR in the first release of Hive 2.0
Date Thu, 22 Oct 2015 21:38:25 GMT
(Adding [DISCUSS] to the subject to bring it to the attention of a wider audience.)

+1. Given how much investment is going into the Tez and Spark execution
modes, it makes sense to convey that better to the user community and
recommend the use of the new modes over MR. Users who choose those
modes are going to get a better experience, and that will help improve
the overall perception of Hive.

Once most users have moved to the new modes, we can start looking into
removing MR support (though that is likely to take a while).

On Wed, Oct 21, 2015 at 9:44 PM, Sergey Shelukhin
<> wrote:
> We have discussed the removal of hadoop-1 and MR support in the Hive 2 line in the past.
> Hadoop-1 removal seems to be non-controversial and on track; before we cut the first
> release of Hive 2, I propose we deprecate MR.
> Tez and Spark engines provide vast perf improvements over MR;
> Execution optimization work by most contributors for a long time has been done for these
> engines and is not portable to MR, so it is languishing further;
> At the same time, supporting additional code has other development costs for new features
> or bugs, plus we have to run tests for it both in Apache and for local changes and to deploy
> However, MR is hard to remove. Plus, it may provide a baseline for some bugs in other
> engines (which is not bulletproof since MR logic can be incorrect), or to mock during perf
> Therefore, I propose that for now we add deprecation warnings suggesting the other alternatives:
>   *   to Hive configuration documentation.
>   *   to Hive wiki.
>   *   to release notes on Hive 2.
>   *   in Beeline and CLI when using MR.
> Additionally, I propose we remove Minimr test driver from HiveQA runs for master.
> What do you think?
