hive-dev mailing list archives

From Kristopher Glover <kglo...@appnexus.com>
Subject Re: threading with hive client
Date Thu, 15 Aug 2013 21:29:57 GMT
Thanks for all the great insight. I'll poke around a little more to see if I could at least
start documenting the changes required to make everything thread safe as well as remove the
synchronization.

@Xuefu-
I completely understand your points. I was just trying to figure out whether there was a specific
functional reason for making Driver#compile and Driver#execute public when there was a known
vulnerability. For instance, why not synchronize inside the compile method itself instead of
relying on external synchronization? From the sound of it there was no specific reason, other
than that no one has gotten around to making the improvements yet. Maybe it'll be something I
can contribute back.

Thanks again,
Kris
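
To make the "synchronize the compile method itself" idea concrete, here is a minimal sketch. It
is not Hive code: SketchDriver, COMPILE_MONITOR and maxConcurrentCompiles are hypothetical names
made up for illustration, and the sleep stands in for the non-thread-safe analysis. Note that a
plain synchronized instance method would not be enough, because each thread uses its own
instance; the lock has to be static, as compileMonitor is, and it is taken inside compile() so
that callers who skip run() still go through it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CompileLockSketch {

    /** Hypothetical stand-in for Driver; this is NOT the real Hive class. */
    static class SketchDriver {
        // One lock shared by every instance, like the static compileMonitor.
        private static final Object COMPILE_MONITOR = new Object();

        // Demo-only counters used to observe how many compiles overlap.
        static final AtomicInteger active = new AtomicInteger();
        static final AtomicInteger maxActive = new AtomicInteger();

        /** The lock is acquired here, so direct callers cannot bypass it. */
        public int compile(String command) throws InterruptedException {
            synchronized (COMPILE_MONITOR) {
                int now = active.incrementAndGet();
                maxActive.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // placeholder for non-thread-safe work
                    return 0;
                } finally {
                    active.decrementAndGet();
                }
            }
        }

        /** run() no longer needs its own synchronized block around compile(). */
        public int run(String command) throws InterruptedException {
            int ret = compile(command);
            // execution would follow here
            return ret;
        }
    }

    /** Calls compile() from several threads, each with its own instance. */
    static int maxConcurrentCompiles(int threads) throws InterruptedException {
        SketchDriver.active.set(0);
        SketchDriver.maxActive.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                start.await(); // release all workers at the same moment
                return new SketchDriver().compile("select 1");
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return SketchDriver.maxActive.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("max concurrent compiles: " + maxConcurrentCompiles(4));
    }
}
```

With the lock inside compile(), the observed maximum stays at 1 even when every thread calls
compile() directly on a separate instance. This only moves the lock; it does not remove the
serialization that the original question was about.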

Xuefu Zhang wrote:
To add,

1. Being public doesn't necessarily guarantee thread-safety. Of course,
this is no excuse for not documenting thread-safety.
2. Sometimes a method is made public for testing, which is bad in my
opinion, but I have seen many instances like this before.

--Xuefu



On Thu, Aug 15, 2013 at 1:11 PM, Brock Noland <brock@cloudera.com> wrote:

> Well you would have probably found the areas we need to fix! :) The hive
> source is not strict about method and member visibility. The good news
> is that we have been making significant improvements in this aspect.
>
> Brock
>
>
> On Thu, Aug 15, 2013 at 2:55 PM, Kristopher Glover <kglover@appnexus.com> wrote:
>
> > Interesting, I didn't realize that. If that's the case then I suppose it'd
> > be really bad for me to circumvent the lock by reproducing the Driver#run
> > method by calling Driver#compile and Driver#execute directly from within
> > my app.
> >
> > If that is the case why make Driver#compile and Driver#execute public
> > methods? There doesn't seem to be any inheritance that requires them to be
> > public and the fact that they are public opens up a thread safety issue.
> >
> > Thanks,
> > Kris
> >
> > On 8/15/13 1:11 PM, "Brock Noland" <brock@cloudera.com> wrote:
> >
> > >The hive semantic analyzer is not fully thread safe.  We'd like to remove
> > >that lock but it will be a large project.
> > >
> > >Brock
> > >
> > >
> > >On Thu, Aug 15, 2013 at 11:12 AM, Kristopher Glover <kglover@appnexus.com> wrote:
> > >
> > >> Hi Everyone,
> > >>
> > >> I'm experiencing a threading issue with the Hive client where I want to
> > >> run multiple queries on the same JVM.
> > >>
> > >> The problem I'm having is that org.apache.hadoop.hive.ql.Driver#run (line
> > >> 907) has the following few lines of code:
> > >>
> > >>     synchronized (compileMonitor) {
> > >>       ret = compile(command);
> > >>     }
> > >>
> > >>
> > >> The compileMonitor is static, so it blocks all threads even though I'm
> > >> using different instances of the Driver class. I could explicitly call
> > >> Driver#compile then Driver#execute to avoid the synchronized block, but I
> > >> don't know if it's serving a special purpose. Does anyone know why that
> > >> synchronized block is there and if it's really necessary?
> > >>
> > >>
> > >> Thanks,
> > >>
> > >> Kris
> > >>
> > >
> > >
> > >
> > >--
> > >Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> >
> >
>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>
