hive-dev mailing list archives

From "Chaitanya Mishra (JIRA)" <>
Subject [jira] Commented: (HIVE-549) Parallel Execution Mechanism
Date Mon, 23 Nov 2009 21:55:40 GMT


Chaitanya Mishra commented on HIVE-549:

The patch fails the following unit tests:

TestCliDriver: input41.q, input42.q and input_part9.q

For input41.q, the query involves a UNION ALL, and the test fails because the threads can
execute the two parts of the UNION ALL in whatever order they see fit, so the order of the
output rows is no longer fixed.

Other queries with the same problem are input25.q, input26.q, nullgroup5.q, semijoin.q and
union_script.q. We need to rewrite these test cases.
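
To illustrate the union-all failure mode, here is a minimal Python sketch, not Hive code. The names run_branch and parallel_union_all, the row data, and the delays are all hypothetical; the point is only that rows arrive in completion order when the branches run in separate threads, so a line-by-line diff against a fixed expected output can fail even though the set of rows is correct. Sorting both sides before comparing is one assumed way a rewritten test could avoid this.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_branch(name, rows, delay):
    """Hypothetical stand-in for one side of a UNION ALL.

    `delay` simulates how long that side's job takes to run.
    """
    time.sleep(delay)
    return [(name, r) for r in rows]


def parallel_union_all(branches):
    """Run every branch in its own thread; append rows in completion order."""
    out = []
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(run_branch, *b) for b in branches]
        for fut in as_completed(futures):
            out.extend(fut.result())
    return out


# The left branch is deliberately slower, so the right branch usually
# finishes first and its rows usually come first in `result`.
left = ("left", [1, 2], 0.05)
right = ("right", [3, 4], 0.0)
result = parallel_union_all([left, right])

# What a serial run (left, then right) would have produced.
expected = [("left", 1), ("left", 2), ("right", 3), ("right", 4)]

# Order-sensitive comparison depends on thread timing; the sorted
# comparison does not.
rows_match = sorted(result) == sorted(expected)
```

Comparing result == expected directly depends on which thread finishes first, while rows_match holds regardless of timing.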

For input42.q and input_part9.q, the problem is that the base table has 2 partitions, and
Hive can technically read the partitions in any order it sees fit.

In fact, I checked out the latest version of Hive and ran the unit test for input_part9.q,
and it failed because the data was generated in the opposite order. I think these two tests
should be deprecated.
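
The partition-order problem can be shown with a similar small Python sketch. The partition names p1 and p2, the rows, and the read_table helper are hypothetical and do not come from the actual test data; the sketch only shows that reading the same two partitions in the opposite order yields the same rows but a different output file, which is what an order-sensitive expected-output diff trips over.

```python
from collections import Counter


def read_table(partitions, order):
    """Hypothetical reader: concatenate partition rows in the given order."""
    rows = []
    for key in order:
        rows.extend(partitions[key])
    return rows


# Made-up base table with 2 partitions.
partitions = {
    "p1": ["a\t1", "b\t2"],
    "p2": ["c\t3"],
}

forward = read_table(partitions, ["p1", "p2"])
backward = read_table(partitions, ["p2", "p1"])

# An order-sensitive comparison sees two different outputs ...
same_order = forward == backward
# ... while a multiset comparison sees identical contents.
same_rows = Counter(forward) == Counter(backward)
```

Here same_order is False and same_rows is True: the query result is the same in both cases, and only the order-sensitive check distinguishes them.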

> Parallel Execution Mechanism
> ----------------------------
>                 Key: HIVE-549
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Wish
>          Components: Query Processor
>            Reporter: Adam Kramer
>            Assignee: Chaitanya Mishra
>         Attachments: HIVE-549-v4.patch, HIVE-549-v5.patch
> In a massively parallel database system, it would be awesome to also parallelize some
> of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT statements,
> effectively you could run those statements in parallel. There's no situation (that I can think
> of, but I don't have a formal proof) in which the left statement would rely on the right statement,
> or vice versa. So, they could be run at the same time...and perhaps they should be. Or, perhaps
> there should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.