hive-dev mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: threading with hive client
Date Fri, 16 Aug 2013 14:40:08 GMT
https://issues.apache.org/jira/browse/HIVE-4239
https://issues.apache.org/jira/browse/HIVE-80

I would guess this comment on HIVE-80 is still applicable:

"there may still be some thread-unsafe code, but no one knows for sure.
Given that, the only approach may be to do as much review as possible (e.g.
grep for statics that shouldn't be there), ask everyone to add any known
issues here, and then set up a testbed and see what turns up."
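
As a rough illustration of the "grep for statics" step in that comment, here is a minimal sketch of a line-based scan for static fields that are not final. The class name StaticFieldScan, the helper looksLikeMutableStaticField, and the heuristic itself are assumptions made for this example and are not taken from the Hive code base. It will miss multi-line declarations and can flag harmless fields, so it only narrows down where to start reading.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaticFieldScan {
    // Heuristic (an assumption of this sketch): a single-line declaration
    // that contains the modifier "static" but not "final".
    static boolean looksLikeMutableStaticField(String line) {
        String s = line.trim();
        if (!s.endsWith(";") || s.startsWith("import ")
                || s.startsWith("//") || s.startsWith("*")) {
            return false;
        }
        // Only inspect the part before any initializer.
        int eq = s.indexOf('=');
        String decl = (eq >= 0) ? s.substring(0, eq) : s.substring(0, s.length() - 1);
        if (decl.contains("(")) {
            return false; // method declaration or call, not a field
        }
        List<String> words = Arrays.asList(decl.trim().split("\\s+"));
        return words.contains("static") && !words.contains("final");
    }

    // Returns "lineNumber: text" for every line the heuristic flags.
    static List<String> scan(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (looksLikeMutableStaticField(lines.get(i))) {
                hits.add((i + 1) + ": " + lines.get(i).trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Scan the file given on the command line, or a tiny made-up sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList(
                        "private static final Object lock = new Object();",
                        "private static Map<String, Integer> cache = new HashMap<>();",
                        "static int counter;");
        for (String hit : scan(lines)) {
            System.out.println(hit);
        }
    }
}
```

Each hit is only a candidate: a static field that is written once at class load, or guarded by its own lock, may be perfectly safe.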


On Fri, Aug 16, 2013 at 9:28 AM, Kristopher Glover <kglover@appnexus.com> wrote:

> One more question. If the SemanticAnalyzer isn't fully thread safe, could
> you provide any pointers as to why it may not be thread safe? It's a 9000
> line file so any hints as to where to get started would be much
> appreciated. I don't see anything very obvious like globally shared member
> variables so I'm guessing it's more subtle than that.
>
> Thanks,
> Kris
>
> On 8/15/13 5:29 PM, "Kristopher Glover" <kglover@appnexus.com> wrote:
>
> >Thanks for all the great insight. I'll poke around a little more to see
> >if I could at least start documenting the changes required to make
> >everything thread safe as well as remove the synchronization.
> >
> >@Xuefu-
> >I completely understand your points. I was just trying to figure out if
> >there was a specific functional reason for making them public when there
> >was a known vulnerability. For instance, why not synchronize the compile
> >method itself instead of relying on external synchronization? From the
> >sound of it there were no specific reasons, other than no one has gotten
> >around to making the improvements yet. Maybe it'll be something I can
> >contribute back.
> >
> >Thanks again,
> >Kris
> >
> >Xuefu Zhang wrote:
> >To add,
> >
> >1. Being public doesn't necessarily guarantee thread-safety. Of course,
> >this is no excuse for not documenting thread-safety.
> >2. Sometimes a method is made public for testing, which is bad in my
> >opinion, but I saw many instances like this before.
> >
> >--Xuefu
> >
> >
> >
> >On Thu, Aug 15, 2013 at 1:11 PM, Brock Noland <brock@cloudera.com> wrote:
> >
> >> Well you would have probably found the areas we need to fix! :) The hive
> >> source is not strict about method and member visibility. The good news
> >> is that we have been making significant improvements in this aspect.
> >>
> >> Brock
> >>
> >>
> >> On Thu, Aug 15, 2013 at 2:55 PM, Kristopher Glover <kglover@appnexus.com> wrote:
> >>
> >> > Interesting, I didn't realize that. If that's the case then I suppose
> >> > it'd be really bad for me to circumvent the lock by reproducing the
> >> > Driver#run method by calling Driver#compile and Driver#execute directly
> >> > from within my app.
> >> >
> >> > If that is the case why make Driver#compile and Driver#execute public
> >> > methods? There doesn't seem to be any inheritance that requires them to
> >> > be public and the fact that they are public opens up a thread safety
> >> > issue.
> >> >
> >> > Thanks,
> >> > Kris
> >> >
> >> > On 8/15/13 1:11 PM, "Brock Noland" <brock@cloudera.com> wrote:
> >> >
> >> > >The hive semantic analyzer is not fully thread safe.  We'd like to
> >> > >remove that lock but it will be a large project.
> >> > >
> >> > >Brock
> >> > >
> >> > >
> >> > >On Thu, Aug 15, 2013 at 11:12 AM, Kristopher Glover
> >> > ><kglover@appnexus.com> wrote:
> >> > >
> >> > >> Hi Everyone,
> >> > >>
> >> > >> I'm experiencing a threading issue with the Hive client where I want
> >> > >> to run multiple queries on the same JVM.
> >> > >>
> >> > >> The problem I'm having is that org.apache.hadoop.hive.ql.Driver#run
> >> > >> (line 907) has the following few lines of code:
> >> > >>
> >> > >>     synchronized (compileMonitor) {
> >> > >>       ret = compile(command);
> >> > >>     }
> >> > >>
> >> > >>
> >> > >> The compileMonitor is a static so it blocks all threads even though
> >> > >> I'm using different instances of the Driver class. I could explicitly
> >> > >> call Driver#compile then Driver#execute to avoid the synchronized
> >> > >> block but I don't know if it's serving a special purpose. Does anyone
> >> > >> know why that synchronized block is there and if it's really necessary?
> >> > >>
> >> > >>
> >> > >> Thanks,
> >> > >>
> >> > >> Kris
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > >--
> >> > >Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> >> >
> >> >
> >>
> >>
> >>
>
>
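
For readers following the quoted thread, here is a minimal sketch of why a static monitor serializes threads even when each thread uses its own object. FakeDriver is a hypothetical stand-in and not Hive's Driver; only the name compileMonitor and the shape of the synchronized block come from the snippet quoted above. The sleep and the two counters are assumptions added purely to make the effect observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StaticMonitorDemo {
    // Hypothetical stand-in for Driver: one lock object shared by all instances.
    static class FakeDriver {
        private static final Object compileMonitor = new Object();
        private static final AtomicInteger inside = new AtomicInteger();
        private static final AtomicInteger maxInside = new AtomicInteger();

        int run(String command) {
            synchronized (compileMonitor) {
                // Record how many threads are in the guarded section right now.
                int now = inside.incrementAndGet();
                maxInside.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // pretend to compile the command
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                inside.decrementAndGet();
            }
            return 0;
        }
    }

    // Starts one thread per FakeDriver instance and returns the largest number
    // of threads that were ever inside the synchronized block together.
    static int maxConcurrentCompiles(int threads) {
        FakeDriver.inside.set(0);
        FakeDriver.maxInside.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> new FakeDriver().run("select 1"));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return FakeDriver.maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("max threads inside compile at once: "
                + maxConcurrentCompiles(4));
    }
}
```

Because every FakeDriver shares the one static lock, the reported maximum is 1 regardless of how many instances exist. Declaring the lock as an instance field would let the threads overlap, which is only safe if the guarded code touches no shared state, and that is exactly what is in doubt for the semantic analyzer.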


-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
