hive-user mailing list archives

From Gopal Vijayaraghavan
Subject Re: Hive query on Tez slower than on MR (fails in some cases) ..
Date Fri, 19 Feb 2016 19:39:26 GMT

> Here's the Tez DAG swimlane. Haven't gotten to work.. will
> send that too soon.

Pretty clear that the map-side is fine - splitting sort buffers isn't
bothering this at all.

We want to over-partition Reducer 7, and possibly let every reducer vertex
pick its total # of reducers dynamically:

set hive.exec.parallel=false; -- bad idea on Tez

set; -- decide on total # of reducers dynamically
set hive.tez.min.partition.factor=0.1;

set hive.tez.max.partition.factor=10;

set tez.shuffle-vertex-manager.min-src-fraction=0.9; -- slow start min (reducer counts are picked at this point)
set tez.shuffle-vertex-manager.max-src-fraction=0.99;
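
For a rough feel of what the two partition factors above do, here is a minimal sketch. It assumes, as I understand the planner, that Hive scales its compile-time reducer estimate by each factor and caps the result at hive.exec.reducers.max. The function name, the estimate of 50 and the cap of 1009 are made up for illustration, and this is not Hive's actual code.

```python
def partition_bounds(estimated_reducers, min_factor, max_factor, max_reducers):
    """Return (min_partitions, max_partitions) for one reducer vertex.

    Hypothetical helper: the vertex would start with max_partitions
    partitions and could be shrunk at runtime down to min_partitions.
    """
    # Scale the compile-time estimate by each factor, never going below 1.
    lo = max(1, int(estimated_reducers * min_factor))
    hi = max(1, int(estimated_reducers * max_factor))
    # Cap both bounds at the configured reducer limit (assumed value).
    return min(lo, max_reducers), min(hi, max_reducers)


# Estimate of 50 reducers with the factors 0.1 and 10 from the settings above.
print(partition_bounds(50, 0.1, 10, 1009))  # (5, 500)
```

So with these settings a vertex estimated at 50 reducers would be over-partitioned to 500 and could be collapsed to as few as 5 once min-src-fraction of the upstream tasks have reported their output sizes.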


(experimental!! - I'm still testing this for machine failure tolerance)

set tez.runtime.pipelined-shuffle.enabled=true;

