hive-user mailing list archives

From Swagatika Tripathy <swagatikat...@gmail.com>
Subject Re: Executing Hive Queries in Parallel
Date Sun, 27 Apr 2014 20:58:39 GMT
Hi,
You can also use Oozie's fork feature. Oozie is a workflow scheduler, and a
fork lets it run jobs in parallel. You just need to define all your HQL
scripts inside the workflow.xml to make them run in parallel.
On Apr 22, 2014 3:14 AM, "Subramanian, Sanjay (HQP)" <
sanjay.subramanian@roberthalf.com> wrote:

>   Hey
>
>  Instead of going into the Hive CLI,
>  I would propose 2 ways:
>
>  *NOHUP*
>  nohup hive -f path/to/query/file/hive1.hql >> ./hive1.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>  nohup hive -f path/to/query/file/hive2.hql >> ./hive2.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>  nohup hive -f path/to/query/file/hive3.hql >> ./hive3.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>  nohup hive -f path/to/query/file/hive4.hql >> ./hive4.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>  nohup hive -f path/to/query/file/hive5.hql >> ./hive5.hql_`date +%Y-%m-%d-%H-%M-%S`.log 2>&1 &
>
>  Each statement above launches MR jobs on your cluster and, depending
> on the cluster configuration, the jobs will run in parallel.
>  Scheduling jobs on the MR cluster is independent of Hive.
>
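
The nohup approach above can be wrapped in a small script that starts one
hive client per file in the background and then waits for all of them. This
is only a sketch: run_hql_parallel, HIVE_BIN and LOG_DIR are names made up
for this example, and whether the resulting MR jobs actually overlap still
depends on the cluster scheduler, as noted above.

```shell
#!/bin/sh
# Hypothetical settings for this example: the hive executable and log directory.
HIVE_BIN="${HIVE_BIN:-hive}"
LOG_DIR="${LOG_DIR:-.}"

# Start one hive client per .hql file in the background, then wait for all.
# Returns the number of clients that exited with a non-zero status.
run_hql_parallel() {
  stamp=$(date +%Y-%m-%d-%H-%M-%S)
  pids=""
  for hql in "$@"; do
    # The trailing & is what lets the next client start immediately.
    nohup "$HIVE_BIN" -f "$hql" >> "$LOG_DIR/$(basename "$hql")_${stamp}.log" 2>&1 &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  return "$failed"
}
```

It would be called as, for example,
run_hql_parallel path/to/query/file/hive1.hql path/to/query/file/hive2.hql
and the return value tells you how many of the clients failed.
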
>  *SCREEN sessions*
>
>    - Create a screen session
>       - screen -S hive_query1
>       - You are now inside the screen session hive_query1
>          - hive -f path/to/query/file/hive1.hql
>       - Ctrl-A D
>          - This detaches you from the screen session
>       - Repeat for each Hive query you want to run
>       - E.g. 5 screen sessions, each running a Hive query
>    - To list the active screen sessions
>       - screen -ls
>    - To attach to a screen session
>       - screen -x hive_query1
>
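
The screen steps above can also be scripted, since screen can start a
session that is already detached (-d -m) and named (-S). The sketch below
assumes that; start_hive_screens, SCREEN_BIN and HIVE_BIN are hypothetical
names for this example, and each session is named after its .hql file rather
than hive_query1. Note that a session started this way closes as soon as the
hive command exits, unlike a session you opened interactively.

```shell
#!/bin/sh
# Hypothetical settings for this example: the screen and hive executables.
SCREEN_BIN="${SCREEN_BIN:-screen}"
HIVE_BIN="${HIVE_BIN:-hive}"

# Start one detached, named screen session per .hql file.
start_hive_screens() {
  for hql in "$@"; do
    # Session name is the file name without its .hql suffix.
    name=$(basename "$hql" .hql)
    # -d -m: start the session detached; -S: give it a name.
    "$SCREEN_BIN" -dmS "$name" "$HIVE_BIN" -f "$hql"
  done
}
```

After calling start_hive_screens with your query files, screen -ls should
list the sessions that are still running, and screen -x followed by a
session name attaches to one of them.
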
>
>  Thanks
>
> Warm Regards
>
>
>  Sanjay
>
>
>    From: saurabh <mpp.databases@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Monday, April 21, 2014 at 1:53 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Executing Hive Queries in Parallel
>
>
>  Hi,
>  I need some input on executing Hive queries in parallel. I tried doing
> this using the CLI (by opening multiple SSH connections) and executed 4 HQL
> scripts; the queries were executed sequentially. All FOUR queries were
> submitted, but while the first one was executing the others were in a
> pending state. I was doing this on EMR running in batch mode, so I was not
> able to dig into the logs.
>
>  The Hive CLI uses a native Hive connection, which by default uses the FIFO
> scheduler. This might be one of the reasons the queries are executed
> in sequence.
>
>  I also observed that when multiple queries are executed using multiple
> HUE sessions, they do run in parallel. Can you please suggest how this
> behavior of HUE can be replicated using the CLI?
>
>  I am aware of the Beeswax client; however, I am not sure how it can be used
> during EMR batch mode processing.
>
>  Thanks in advance for going through this. Kindly let me know your
> thoughts on the same.
>
>
