hadoop-hive-dev mailing list archives

From "Adam Kramer (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-549) UNION ALL statements should be run in parallel
Date Mon, 08 Jun 2009 07:50:07 GMT

     [ https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kramer updated HIVE-549:
-----------------------------

    Description: 
In a massively parallel database system, it would be awesome to also parallelize some of the
mapreduce phases that our data needs to go through.

One example that just occurred to me is UNION ALL: when you union two SELECT statements, effectively
you could run those statements in parallel. There's no situation (that I can think of, but
I don't have a formal proof) in which the left statement would rely on the right statement,
or vice versa. So, they could be run at the same time...and perhaps they should be. Or, perhaps
there should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?

  was:
In a massively parallel database system, it would be awesome to also parallelize some of the
mapreduce phases that our data needs to go through.

One example that just occurred to me is UNION ALL--when you union two SELECT statements, effectively
you could run those statements in parallel--there's no situation (that I can think of, but
I don't have a formal proof) in which the left statement would rely on the right statement,
or vice versa. So, they could be run at the same time...and perhaps they should be. Or, perhaps
there should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?


> UNION ALL statements should be run in parallel
> ----------------------------------------------
>
>                 Key: HIVE-549
>                 URL: https://issues.apache.org/jira/browse/HIVE-549
>             Project: Hadoop Hive
>          Issue Type: Wish
>          Components: Query Processor
>            Reporter: Adam Kramer
>
> In a massively parallel database system, it would be awesome to also parallelize some
> of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT statements,
> effectively you could run those statements in parallel. There's no situation (that I can think
> of, but I don't have a formal proof) in which the left statement would rely on the right statement,
> or vice versa. So, they could be run at the same time...and perhaps they should be. Or, perhaps
> there should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?
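
To illustrate the idea in the description, here is a minimal Python sketch (not Hive code, and not a proposed implementation). It assumes the two sides of the UNION ALL are independent, as the description argues, and models each side as a plain function. The names union_all_parallel, select_left and select_right are hypothetical stand-ins for the planner step and the two SELECT statements.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Row = Tuple
Branch = Callable[[], List[Row]]


def union_all_parallel(branches: Sequence[Branch]) -> List[Row]:
    """Run every independent branch concurrently, then concatenate the rows.

    UNION ALL keeps duplicates and needs no merge step, so the outputs can
    simply be appended once each branch has finished.
    """
    if not branches:
        return []
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        # Submit all branches up front so they can overlap in time.
        futures = [pool.submit(branch) for branch in branches]
        rows: List[Row] = []
        # Collect in submission order: left branch first, then right.
        for future in futures:
            rows.extend(future.result())
    return rows


# Hypothetical stand-ins for the left and right SELECT statements.
def select_left() -> List[Row]:
    return [(1, "a"), (2, "b")]


def select_right() -> List[Row]:
    return [(2, "b"), (3, "c")]


result = union_all_parallel([select_left, select_right])
print(result)
```

Because neither branch reads the other's output, the only synchronization point is the final concatenation. A real query engine would also have to weigh cluster capacity before launching both jobs at once.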

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

