hive-dev mailing list archives

From Jayanth Muthya <>
Subject Re: Concurrency in hive
Date Fri, 22 Jun 2012 09:14:04 GMT
Thanks for clarifying. I'll look into it too and see if I can find anything.


On Thu, Jun 21, 2012 at 10:47 PM, Jerome Banks <> wrote:

> set hive.exec.parallel=true;
> This will run Hive jobs in parallel, if they are able to do so.
> As for multi-threading in the actual job itself, I don't think so, but I'm
> not sure. The query planner will merge steps together to try to
> minimize the number of MR jobs needed to run a query, but I think those are
> chained together in a single thread, on both the mapper and the reducer.
> When I was at Quantcast, we had some multi-threading in the mappers and
> reducers to try to increase throughput by utilizing the CPU when the job
> would otherwise be blocked on IO. This helps if your IO is very slow,
> but once IO is no longer the bottleneck, you spend a lot of time
> context-switching, and it is no longer efficient.
> Interesting question, I'll look into it some more. Let me know if you find
> out anything.
> -- jerome
> On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <> wrote:
> > Hi,
> > I was looking into some of the source code for Hive and had a few
> > questions regarding parallelism in Hive. Can a map task in
> > Hive exploit parallelism and run multiple threads? If it can,
> > does it do so by default, or does the user have to configure the settings?
> > This question seems really basic; I just started looking into
> > hadoop/hive.
> > Thanks in advance!
> >
> > -Jay
> >
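
For reference, a minimal sketch of the session settings discussed in this thread. hive.exec.parallel is the setting quoted above. hive.exec.parallel.thread.number is not mentioned in the thread; it is added here on the assumption that you also want to cap how many independent jobs run at once, and the value 8 is only illustrative. These settings affect independent stages of a single query; they do not make an individual map or reduce task multi-threaded.

```sql
-- Allow independent stages (MR jobs) of one query to run concurrently.
set hive.exec.parallel=true;

-- Assumption, not from the thread: upper bound on jobs run in parallel.
-- The value 8 is shown for illustration only.
set hive.exec.parallel.thread.number=8;
```

Whether this helps depends on the query plan: stages that depend on each other's output still run one after another.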
