hive-user mailing list archives

From "Tucker, Matt" <>
Subject Re: Executing multiple queries in parallel from the one .hql file
Date Wed, 20 Jun 2012 00:31:33 GMT

Statements in a query file are executed serially. The parallelization flag (hive.exec.parallel)
works within a single query: when Hive parses a query, it runs that query's independent stages
in parallel. It does not run separate statements from the same file concurrently.

If the queries are completely independent of each other, it may be better to split them into
separate files and set up multiple Oozie actions. If queries rely on the result sets of prior
queries, you're best off keeping them in a single file, or writing logic outside of Hive to
manage the order of execution.
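
As a rough illustration of "logic outside of Hive", here is a minimal Python sketch. It assumes
the statements have already been split into separate .hql files and that the `hive` CLI is on
the PATH. The function names and the file names in the comments are hypothetical. Scripts in
the same group run concurrently; the groups themselves run in order, so a later group can rely
on the results of an earlier one.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(hql_path):
    """Build the Hive CLI call for one script (assumes `hive` is on PATH)."""
    return ["hive", "-f", hql_path]


def run_group(hql_paths, command_for=build_command, max_workers=4):
    """Run independent scripts concurrently and return {path: exit code}."""
    def run_one(path):
        # Each script gets its own Hive CLI process.
        return path, subprocess.run(command_for(path)).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, hql_paths))


def run_pipeline(groups, command_for=build_command):
    """Run groups one after another; scripts inside a group run in parallel.

    Stops before starting the next group if any script in the current
    group exits with a non-zero status.
    """
    results = {}
    for group in groups:
        codes = run_group(group, command_for)
        results.update(codes)
        if any(code != 0 for code in codes.values()):
            break
    return results


# Example call (hypothetical file names): the two loads are independent,
# and the report depends on both of them.
# run_pipeline([["load_a.hql", "load_b.hql"], ["report.hql"]])
```

Each script starts its own Hive session, so any session-level settings have to be repeated in
every file. How many jobs actually run at once still depends on the free slots on the cluster.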

On Jun 19, 2012, at 8:05 PM, "drichelson" <> wrote:

> I have multiple statements in a single .hql file that I am calling via an Oozie action.
> Most of these statements can be executed in parallel (they do not depend on each other).
> I already have the parallel execution flag set to true (although I have yet to see multiple
> Hive MR jobs running at once).
> Hive is running them all sequentially.
> Without breaking out each statement into its own Oozie action, I'd like to run most of
> them in parallel. Any ideas?
> To be clear, I am not looking to increase the number of mappers/reducers for each task,
> but to increase the number of MapReduce jobs running at once, as there are typically free
> slots on the cluster not being used.
