hive-dev mailing list archives

From Alan Gates
Subject Re: Tez branch and tez based patches
Date Mon, 05 Aug 2013 17:54:25 GMT

On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote:

> Also watched
> I definitely see the win in being able to stream inter-stage output.
> I see some cases where small intermediate results can be kept "In memory".
> But I was somewhat under the impression that the map reduce spill settings
> kept stuff in memory, isn't that what spill settings are?

No.  MapReduce always writes shuffle data to local disk.  And intermediate results between
MR jobs are always persisted to HDFS, as there's no other option.  When we talk of being able
to keep intermediate results in memory we mean getting rid of both of these disk writes/reads
when appropriate (meaning not always, there's a trade off between speed and error handling
to be made here, see below for more details).

> There are a few bullet points that came up repeatedly that I do not follow:
> Something was said to the effect of "Container reuse makes X faster".
> Hadoop has jvm reuse. Not following what the difference is here? Not
> everyone has a 10K node cluster.

Sharing JVMs across users is inherently insecure (we can't guarantee what code the first user
left behind that may interfere with later users).  As I understand container re-use in Tez
it constrains the re-use to one user for security reasons, but still avoids additional JVM
start up costs.  But this is a question that the Tez guys could answer better on the Tez lists.

> "Joins in map reduce are hard" Really? I mean some of them are I guess, but
> the typical join is very easy. Just shuffle by the join key. There was not
> really enough low level details here saying why joins are better in tez.

Join is not a natural operation in MapReduce.  MR gives you one input and one output.  You
end up having to bend the rules to have multiple inputs.  The idea here is that Tez can
provide operators that naturally work with joins and other operations that don't fit the one
input/one output model (e.g. unions).
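
To make the "bend the rules" point concrete, here is a minimal sketch of the usual reduce-side join workaround: both inputs are forced through a single map output by tagging each record with its source, and the reducer separates them again. This is a plain Python simulation, not Hive or Hadoop code; the function names, the "L"/"R" tags, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict

def map_tagged(records, tag, key_index):
    """Map phase: emit (join_key, (tag, record)) so the reducer can tell sources apart."""
    for record in records:
        yield record[key_index], (tag, record)

def shuffle(pairs):
    """Stand-in for the MR shuffle: group all values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_join(groups):
    """Reduce phase: split values by tag, then emit every left/right pairing per key."""
    for key, values in groups.items():
        left = [rec for tag, rec in values if tag == "L"]
        right = [rec for tag, rec in values if tag == "R"]
        for l in left:
            for r in right:
                yield key, l, r

# Hypothetical sample inputs: two logical tables squeezed into one tagged stream.
users = [(1, "ann"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "cup")]
pairs = list(map_tagged(users, "L", 0)) + list(map_tagged(orders, "R", 0))
joined = sorted(reduce_join(shuffle(pairs)))
```

The tagging and untagging is the part that an engine with native multi-input operators would not need.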

> "Choosing the number of maps and reduces is hard" Really? I do not find it
> that hard, I think there are times when it's not perfect but I do not find
> it hard. The talk did not really offer anything technical here on how tez
> makes this better other than that it could make it better.

Perhaps manual would be a better term here than hard.  In our experience it takes quite a
bit of engineer trial and error to determine the optimal numbers.  This may be ok if you're
going to invest the time once and then run the same query every day for 6 months.  But obviously
it doesn't work for the ad hoc case.  Even in the batch case it's not optimal because every
once in a while an engineer has to go back and re-optimize the query to deal with changing
data sizes, data characteristics, etc.  We want the optimizer to handle this without human
intervention.
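
As a rough illustration of what "letting the system pick" could look like, here is a minimal sketch of a size-based heuristic for the reducer count. It is not how Tez or the Hive optimizer actually decides; the function name, the 256 MiB target per reducer, and the cap of 1009 are assumptions chosen for the example.

```python
import math

def estimate_reducers(input_bytes, bytes_per_reducer=256 * 1024**2, max_reducers=1009):
    """Pick a reducer count from the input size, clamped to [1, max_reducers].

    All defaults are illustrative assumptions, not tuned values.
    """
    if input_bytes <= 0:
        return 1
    wanted = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, wanted))

# The same query gets a different count as the data grows, with no manual re-tuning.
small_day = estimate_reducers(10 * 1024**3)   # 10 GiB of input
large_day = estimate_reducers(40 * 1024**3)   # 40 GiB of input
```

A static heuristic like this only sees input size; handling skew or changing data characteristics needs statistics gathered at run time.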

> The presentations mentioned streaming data, how do two nodes stream data
> between tasks and how is it reliable? If the sender or receiver dies does
> the entire process have to start again?

If the sender or receiver dies then the query has to be restarted from some previous point
where data was persisted to disk.  The idea here is that speed vs error recovery trade offs
should be made by the optimizer.  If the optimizer estimates that a query will complete in
5 seconds it can stream everything and if a node fails it just re-runs the whole query.  If
it estimates that a particular phase of a query will run for an hour it can choose to persist
the results to HDFS so that in the event of a failure downstream the long phase need not be
re-run.  Again we want this to be done automatically by the system so the user doesn't need
to control this level of detail.
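
The trade-off described above can be sketched as a per-phase decision: stream the output of cheap phases, and checkpoint the output of expensive ones so a downstream failure does not force them to be re-run. This is a toy model under stated assumptions, not the Tez or Hive planner; the `Phase` class, the function names, and the 600-second threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    estimated_seconds: float  # the optimizer's runtime estimate for this phase

def choose_output_mode(phase, persist_threshold_seconds=600.0):
    """Return "persist" for phases too expensive to re-run, otherwise "stream"."""
    if phase.estimated_seconds >= persist_threshold_seconds:
        return "persist"
    return "stream"

def plan(phases, persist_threshold_seconds=600.0):
    """Map each phase name to its chosen output mode."""
    return {p.name: choose_output_mode(p, persist_threshold_seconds) for p in phases}

# A 5-second scan is streamed; an hour-long join has its result persisted.
modes = plan([Phase("scan", 5), Phase("big_join", 3600)])
```

A real optimizer would also weigh the cost of the write itself and the failure rate of the cluster, which this sketch ignores.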

> Again one of the talks implied there is a prototype out there that launches
> hive jobs into tez. I would like to see that, it might answer more
> questions than a power point, and I could profile some common queries.

As mentioned in a previous email, afaik Gunther's pushed all these changes to the Tez branch
in Hive.


> Random late night thoughts over,
> Ed
> On Tue, Jul 30, 2013 at 12:02 AM, Edward Capriolo wrote:
>> At ~25:00
>> "There is a working prototype of hive which is using tez as the targeted
>> runtime"
>> Can I get a look at that code? Is it on github?
>> Edward
>> On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates wrote:
>>> Answers to some of your questions inlined.
>>> Alan.
>>> On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote:
>>>> There are some points I want to bring up. First, I am on the PMC. Here is
>>>> something I find relevant:
>>>> ------------------------------
>>>> The role of the PMC from a Foundation perspective is oversight. The main
>>>> role of the PMC is not code and not coding - but to ensure that all legal
>>>> issues are addressed, that procedure is followed, and that each and every
>>>> release is the product of the community as a whole. That is key to our
>>>> litigation protection mechanisms.
>>>> Secondly the role of the PMC is to further the long term development and
>>>> health of the community as a whole, and to ensure that balanced and wide
>>>> scale peer review and collaboration does happen. Within the ASF we worry
>>>> about any community which centers around a few individuals who are working
>>>> virtually uncontested. We believe that this is detrimental to quality,
>>>> stability, and robustness of both code and long term social structures.
>>>> --------------------------------
>>>> -------------------------------------
>>>> All other decisions happen on the dev list, discussions on the private list
>>>> are kept to a minimum.
>>>> "If it didn't happen on the dev list, it didn't happen" - which leads to:
>>>> a) Elections of committers and PMC members are published on the dev list
>>>> once finalized.
>>>> b) Out-of-band discussions (IRC etc.) are summarized on the dev list as
>>>> soon as they have impact on the project, code or community.
>>>> ---------------------------------
>>>> ironically titled "Let their be Tez" has not been +1'ed by any committer. It was
>>>> never discussed on the dev or the user list (as far as I can tell).
>>> As all JIRA creations and updates are sent to dev@hive, creating a JIRA
>>> is de facto posting to the list.
>>>> As a PMC member I feel we need more discussion on Tez on the dev list along
>>>> with a wiki-fied design document. Topics of discussion should include:
>>> I talked with Gunther and he's working on posting a design doc on the
>>> wiki.  He has a PDF on the JIRA but he doesn't have write permissions yet
>>> on the wiki.
>>>> 1) What is tez?
>>> In Hadoop 2.0, YARN opens up the ability to have multiple execution
>>> frameworks in Hadoop.  Hadoop apps are no longer tied to MapReduce as the
>>> only execution option.  Tez is an effort to build an execution engine that
>>> is optimized for relational data processing, such as Hive and Pig.
>>> The biggest change here is to move away from only Map and Reduce as
>>> processing options and to allow alternate combinations of processing, such
>>> as map -> reduce -> reduce or tasks that take multiple inputs or shuffles
>>> that avoid sorting when it isn't needed.
>>> For a good intro to Tez, see Arun's presentation on it at the recent
>>> Hadoop summit (video, slides).
>>>> 2) How is tez different from oozie,,
>>>> , and other DAG and/or streaming map
>>>> reduce tools/frameworks? Why should we use this and not those?
>>> Oozie is a completely different thing.  Oozie is a workflow engine and a
>>> scheduler.  Its core competencies are the ability to coordinate workflows
>>> of disparate job types (MR, Pig, Hive, etc.) and to schedule them.  It is
>>> not intended as an execution engine for apps such as Pig and Hive.
>>> I am not familiar with these other engines, but the short answer is that
>>> Tez is built to work on YARN, which works well for Hive since it is tied to
>>> Hadoop.
>>>> 3) When can we expect the first tez release?
>>> I don't know, but I hope sometime this fall.
>>>> 4) How much effort is involved in integrating hive and tez?
>>> Covered in the design doc.
>>>> 5) Who is ready to commit to this effort?
>>> I'll let people speak for themselves on that one.
>>>> 6) can we expect this work to be done in one hive release?
>>> Unlikely.  Initial integration will be done in one release, but as Tez is
>>> a new project I expect it will be adding features in the future that Hive
>>> will want to take advantage of.
>>>> In my opinion we should not start any work on this tez-hive until these
>>>> questions are answered to the satisfaction of the hive developers.
>>> Can we change this to "not commit patches"?  We can't tell willing people
>>> not to work on it.
>>>> On Mon, Jul 15, 2013 at 9:51 PM, Edward Capriolo wrote:
>>>>>>> The Hive bylaws lay out what votes are needed for what.  I don't see anything
>>>>>>> there about needing 3 +1s for a branch.  Branching would seem to fall under
>>>>>>> code change, which requires one vote and a minimum length of 1 day.
>>>>> You could argue that all you need is one +1 to create a branch, but this
>>>>> is more than a branch. If you are talking about something that is:
>>>>> 1) going to cause major re-factoring of critical pieces of hive like
>>>>> ExecDriver and MapRedTask
>>>>> 2) going to be very disruptive to the efforts of other committers
>>>>> 3) something that may be a major architectural change
>>>>> Getting the project on board with the idea is a good idea.
>>>>> Now I want to point something out. Here are some recent initiatives in
>>>>> hive:
>>>>> 1) At one point there was a big initiative to "support oracle". After the
>>>>> initial work, there are patches in Jira; no one seems to care about oracle
>>>>> support.
>>>>> 2) Another such decision was this "support windows" one, there are
>>>>> probably 4 windows patches waiting for review.
>>>>> 3) I still have no clue what the official hadoop1, hadoop2, hadoop 0.23
>>>>> support perspective is, but every couple weeks we get another jira about
>>>>> something not working/testing on one of those versions, seems like several
>>>>> builds are broken.
>>>>> 4) Hive-storage handler, after the initial implementation no one cares to
>>>>> review any other storage handler implementation, 3 patches there or more,
>>>>> could not even find anyone willing to review the cassandra storage handler
>>>>> I spent months on.
>>>>> 5) ORC, Vectorization
>>>>> 6) Windowing: committed, numerous check-style violations.
>>>>> We have !!!160+!!! PATCH_AVAILABLE Jira issues. Few active committers. We
>>>>> are spread very thin, and embarking on another side project not involved
>>>>> with core hive seems like the wrong direction at the moment.
>>>>> On Mon, Jul 15, 2013 at 8:37 PM, Alan Gates wrote:
>>>>>> On Jul 13, 2013, at 9:48 AM, Edward Capriolo wrote:
>>>>>>> I have started to see several refactoring patches around tez.
>>>>>>> This is the only mention on the hive list I can find with tez:
>>>>>>> "Makes sense. I will create the branch soon.
>>>>>>> Thanks,
>>>>>>> Ashutosh
>>>>>>> On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner wrote:
>>>>>>>> Hi,
>>>>>>>> I am starting to work on integrating Tez into Hive (see HIVE-4660, design
>>>>>>>> doc has already been uploaded - any feedback will be much appreciated).
>>>>>>>> This will be a fair amount of work that will take time to stabilize/test.
>>>>>>>> I'd like to propose creating a branch in order to be able to do this
>>>>>>>> incrementally and collaboratively. In order to progress rapidly with this,
>>>>>>>> I would also like to go "commit-then-review".
>>>>>>>> Thanks,
>>>>>>>> Gunther.
>>>>>>>> "
>>>>>>> These refactorings are largely destructive to a number of bugs and
>>>>>>> language improvements in hive. The language improvements and bug fixes that
>>>>>>> have been sitting in Jira for quite some time now are marked patch-available
>>>>>>> and are waiting for review.
>>>>>>> There are a few things I want to point out:
>>>>>>> 1) Normally we create design docs in our wiki (which it is not)
>>>>>>> 2) Normally when the change is significantly complex we get multiple
>>>>>>> committers to comment on it (which we did not)
>>>>>>> On point 2 no one -1'ed the branch, but this is really something that should
>>>>>>> have required a +1 from 3 committers.
>>>>>> The Hive bylaws lay out what votes are needed for what.  I don't see
>>>>>> anything there about needing 3 +1s for a branch.  Branching would seem to
>>>>>> fall under code change, which requires one vote and a minimum length of 1 day.
>>>>>>> I for one am not completely sold on Tez.
>>>>>>> "directed-acyclic-graph of tasks for processing data" this description
>>>>>>> sounds like many things which have never become popular. One to think of is
>>>>>>> oozie "Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of
>>>>>>> actions.". I am sure I can find a number of libraries/frameworks that make
>>>>>>> this same claim. In general I do not feel like we have done our homework
>>>>>>> and pre-requisites to justify all this work. If we have done our homework,
>>>>>>> I am sure that it has not been communicated and accepted by hive developers
>>>>>>> at large.
>>>>>> A request for better documentation on Tez and a project road map is
>>>>>> totally reasonable.
>>>>>>> If we have a branch, why are we also committing on trunk? Scanning through
>>>>>>> the tez doc the only language I keep finding is language like "minimal changes
>>>>>>> to the planner" yet, there are ALREADY lots of large changes going on.
>>>>>>> Really none of the above would bother me except for the fact that these
>>>>>>> "minimal changes" are causing many "patch available" ready-for-review bugs
>>>>>>> and core hive features to need to be rebased.
>>>>>>> I am sure I have mentioned this before, but I have to spend 12+ hours to
>>>>>>> test a single patch on my laptop. A few days ago I was testing a new core
>>>>>>> hive feature. After all the tests passed and before I was able to commit,
>>>>>>> someone unleashed a tez patch on trunk which caused the thing I was testing
>>>>>>> for 12 hours to need to be rebased.
>>>>>>> I'm not cool with this. Next time that happens to me I will seriously
>>>>>>> consider reverting the patch. Bug fixes and new hive features are more
>>>>>>> important to me than integrating with incubator projects.
>>>>>> (With my Apache member hat on)  Reverting patches that aren't breaking
>>>>>> the build is considered very bad form in Apache.  It does make sense to
>>>>>> request that when people are going to commit a patch that will break many
>>>>>> other patches they first give a few hours of notice so people can say
>>>>>> something if they're about to commit another patch and avoid your fate of
>>>>>> needing to rerun the tests.  The other thing is we need to get the
>>>>>> automated build of patches working on Hive so committers are not forced to
>>>>>> run all of the tests themselves.  We are working on it, but we're not there yet.
>>>>>> Alan.
