hbase-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: HBase client hangs after client-side OOM
Date Thu, 14 Aug 2014 22:30:39 GMT
Hello Ted,

ZooKeeper 3.4.5 is the recommended release to use with HBase 0.94.x.
Regarding compatibility across ZooKeeper releases, I don't think there is
any issue, but the ZK devs might be able to confirm.

cheers,
esteban.


--
Cloudera, Inc.



On Thu, Aug 14, 2014 at 3:19 PM, Ted Tuttle <ted@mentacapital.com> wrote:

> Hello All-
>
> It sounds like upgrading our zookeeper client would be a good idea. Can
> anyone provide some guidelines on compatibility of HBase 0.94.16 with ZK
> 3.4.X? How about compatibility of ZK client 3.4.X w/ ZK server 3.3.4?  I've
> read a few contradictory things about ZK client/server compatibility across
> 3.3/3.4 releases.
>
> Thanks,
> Ted
>
> -----Original Message-----
> From: Ted Tuttle [mailto:ted@mentacapital.com]
> Sent: Thursday, August 14, 2014 12:43 PM
> To: user@hbase.apache.org
> Cc: dev@zookeeper.apache.org
> Subject: RE: HBase client hangs after client-side OOM
>
> Hello Esteban-
>
> At the time of the ZK connection problems the client had an OOM event.
> However, the client machine overall was in fine shape looking at ganglia
> reports;  it certainly wasn't swapping or spending significant cycles on
> I/O wait.
>
> Similarly, our zookeeper server was real chilled as it always is.
>
> Regarding client configuration:
>
> <property>
>     <!--Loaded from hbase-default.xml-->
>     <name>hbase.client.pause</name>
>     <value>1000</value>
> </property>
>
> Thanks,
> Ted
>
> -----Original Message-----
> From: Esteban Gutierrez [mailto:esteban@cloudera.com]
> Sent: Thursday, August 14, 2014 10:47 AM
> To: user@hbase.apache.org
> Cc: dev@zookeeper.apache.org
> Subject: Re: HBase client hangs after client-side OOM
>
> Hi Ted,
>
> I've seen this kind of client "hang" a few times when the underlying
> environment is under heavy swapping and with older versions of ZK, as Rakesh
> mentioned, and also when hbase.client.pause is set to 0. Do you know if your
> environment is experiencing similar behavior, with heavy I/O due to swapping?
> Can you also share your client configuration?
>
> cheers,
> esteban.
>
> --
> Cloudera, Inc.
>
>
>
> On Thu, Aug 14, 2014 at 9:56 AM, Ted Tuttle <ted@mentacapital.com> wrote:
>
> > The client-side thread dump in here:
> >
> > http://pastebin.com/xU4MSq9k
> >
> > SendThread appears to be active.
> >
> > -----Original Message-----
> > From: Rakesh R [mailto:rakeshr@huawei.com]
> > Sent: Thursday, August 14, 2014 7:01 AM
> > To: dev@zookeeper.apache.org; user@hbase.apache.org
> > Subject: RE: HBase client hangs after client-side OOM
> >
> > Hi,
> >
> > >> We are running ZK 3.3.4, Cloudera cdh3u3, HBase 0.94.16.
> >
> > The ZK version is quite old. I can see that ClientCnxn only catches
> > IOException, so when there is an OOME the SendThread will exit.
> > I think that's the reason the client hangs. A client-side thread dump
> > will help us check the liveness of the SendThread.
> >
> > Client-side exception handling has been modified in the 3.4 & 3.5 branches.
> > Can you check the possibility of upgrading to 3.4.6, the latest release?
> >
> > Regards,
> > Rakesh
> >
> > -----Original Message-----
> > From: Qiang Tian [mailto:tianq01@gmail.com]
> > Sent: 14 August 2014 11:03
> > To: user@hbase.apache.org; dev@zookeeper.apache.org
> > Subject: Re: HBase client hangs after client-side OOM
> >
> > The SendThread stack trace does not look correct. Do you have the client
> > log (in case the ZK client code logged something there)? From the ZK code,
> > it looks like ClientCnxn$SendThread.run should have caught it (Throwable)
> > and done the cleanup work, e.g. notified the main thread so that it can
> > wake up from ClientCnxn.submitRequest.
> >
> > Sending to the ZooKeeper list for help.
> > thanks.
> >
> >
> >
> > On Thu, Aug 14, 2014 at 11:19 AM, Ted Tuttle <ted@mentacapital.com> wrote:
> >
> > > Hi Lars-
> > >
> > > We are running ZK 3.3.4, Cloudera cdh3u3, HBase 0.94.16.
> > >
> > > Thanks,
> > > Ted
> > >
> > > > On Aug 13, 2014, at 5:36 PM, "lars hofhansl" <larsh@apache.org> wrote:
> > > >
> > > > Hey Ted,
> > > >
> > > > So this is a problem with the ZK client: it seems not to clean itself
> > > > up correctly upon receiving an exception at the wrong moment.
> > > > Which version of ZK are you using?
> > > >
> > > >
> > > > -- Lars
> > > >
> > > >
> > > >
> > > > ----- Original Message -----
> > > > From: Ted Tuttle <ted@mentacapital.com>
> > > > To: "user@hbase.apache.org" <user@hbase.apache.org>
> > > > Cc: Development <Development@mentacapital.com>
> > > > Sent: Wednesday, August 13, 2014 4:38 PM
> > > > Subject: HBase client hangs after client-side OOM
> > > >
> > > > Hello-
> > > >
> > > > We are running HBase v0.94.16 on an 8 node cluster.
> > > >
> > > > We have a recurring problem w/ HBase clients hanging.  In the latest
> > > > occurrence, I observed the following sequence of events:
> > > >
> > > > 0) client plays w/ HBase for a long time w/o issue
> > > > 1) client runs out of memory during HBase operation:
> > > >
> > > >                 http://pastebin.com/b5x44Lx7
> > > >
> > > > 2) Exception is thrown, memory is released
> > > > 3) In some shutdown logic the client tries to access HBase again
> > > > and hangs:
> > > >
> > > >                 http://pastebin.com/xU4MSq9k
> > > >
> > > > Clearly I need to fix the OOM.  However, the fact that the client
> > > > hangs is not nice.  Any ideas why?
> > > >
> > > > BTW- I started by looking at the zookeeper log. Not much there, but
> > > > here you go:
> > > >
> > > >                 http://pastebin.com/wZvE0Fbv
> > > >
> > > > Thanks,
> > > > Ted
> > > >
> > >
> >
>
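
Rakesh's suggestion in the thread was to check whether the ZooKeeper client's SendThread is still alive. Besides taking a full thread dump, a rough in-process check is possible with plain JDK calls. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the ZooKeeper client I/O thread carries "SendThread" somewhere in its thread name, which should be confirmed against a real thread dump for the ZK version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class SendThreadCheck {

    /**
     * Returns "name [state]" for every live thread whose name contains
     * the given marker. Uses only standard JDK thread introspection.
     */
    public static List<String> liveThreadsNamed(String marker) {
        List<String> matches = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().contains(marker)) {
                matches.add(t.getName() + " [" + t.getState() + "]");
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Assumption: the ZK client I/O thread name contains "SendThread".
        List<String> found = liveThreadsNamed("SendThread");
        if (found.isEmpty()) {
            System.out.println("No live thread with 'SendThread' in its name");
        } else {
            System.out.println("Live candidates: " + found);
        }
    }
}
```

An empty result in a process that holds an open ZooKeeper session would be consistent with the SendThread having exited, as Rakesh describes for the 3.3.x client, but it is not proof on its own.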

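Independently of the ZooKeeper upgrade, the shutdown logic that Ted describes (a call into HBase after the OOM that never returns) could be bounded with a timeout so that the process can still exit. The following is a minimal sketch using only java.util.concurrent; `BoundedCall` and `callWithTimeout` are hypothetical names, and the lambda in `main` merely stands in for whatever HBase call the real shutdown code makes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCall {

    /**
     * Runs the task on a daemon worker thread and waits at most
     * timeoutMillis for its result. Returns fallback if the task
     * times out, fails, or the caller is interrupted.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread worker = new Thread(r, "bounded-call");
            worker.setDaemon(true); // a stuck worker must not block JVM exit
            return worker;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort: interrupts the worker
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself threw
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Placeholder for a shutdown-time HBase call that may hang.
        String result = callWithTimeout(() -> {
            Thread.sleep(10_000);
            return "done";
        }, 200, "timed out");
        System.out.println(result);
    }
}
```

This only keeps the caller from blocking forever; it does not repair the underlying client state, so fixing the OOM and upgrading the ZK client remain the actual remedies discussed in the thread.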