hbase-user mailing list archives

From Ameya Kantikar <am...@groupon.com>
Subject Re: Region servers going down under heavy write load
Date Thu, 06 Jun 2013 00:45:48 GMT
One more thing: I just don't find this "hbase.zookeeper.property.tickTime"
anywhere in the code base.
Also, I could not find a ZooKeeper API that takes tickTime from the client:
http://zookeeper.apache.org/doc/r3.3.3/api/org/apache/zookeeper/ZooKeeper.html
It takes a sessionTimeout value, but not tickTime.
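
(For reference, a minimal sketch of what I mean, using the plain ZooKeeper Java
client; the connect string, the class name and the no-op watcher are just
placeholders:)

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class NegotiatedTimeoutCheck {
    public static void main(String[] args) throws Exception {
        // The constructor takes a requested session timeout in milliseconds;
        // there is no tickTime parameter on the client side.
        Watcher noOp = new Watcher() {
            public void process(WatchedEvent event) { /* ignore events */ }
        };
        ZooKeeper zk = new ZooKeeper("zkhost:2181", 300000, noOp);
        Thread.sleep(2000); // crude wait for the session to be established
        // Returns the timeout the server actually granted, which can be
        // smaller than the 300000 ms we asked for.
        System.out.println("negotiated timeout = " + zk.getSessionTimeout() + " ms");
        zk.close();
    }
}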

Is hbase.zookeeper.property.tickTime even relevant anymore?

So what's the solution: increase tickTime in zoo.cfg (and not
hbase.zookeeper.property.tickTime in hbase-site.xml)?

Ameya


On Wed, Jun 5, 2013 at 3:18 PM, Ameya Kantikar <ameya@groupon.com> wrote:

> Which tickTime is honored?
>
> One in zoo.cfg or hbase.zookeeper.property.tickTime in hbase-site.xml?
>
> My understanding now is that whichever tickTime is honored, the session timeout
> cannot be more than 20 times that value.
>
> I think this is what's happening on my cluster:
>
> My hbase.zookeeper.property.tickTime value is 6000 ms. However, my timeout
> value is 300000 ms, which is more than 20 times the tickTime. Hence ZooKeeper
> uses its syncLimit of 5 to generate 6000*5 = 30000 ms as the timeout value for
> my RS sessions.
>
> I'll try increasing the hbase.zookeeper.property.tickTime value in
> hbase-site.xml and will monitor my cluster over the next few days.
>
> Thanks Kevin & Ted for your help.
>
> Ameya
>
>
>
>
> On Wed, Jun 5, 2013 at 2:45 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> bq. I thought this property in hbase-site.xml takes care of that:
>> zookeeper.session.timeout
>>
>> From
>>
>> http://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#ch_zkSessions
>> :
>>
>> The client sends a requested timeout, the server responds with the timeout
>> that it can give the client. The current implementation requires that the
>> timeout be a minimum of 2 times the tickTime (as set in the server
>> configuration) and a maximum of 20 times the tickTime. The ZooKeeper
>> client
>> API allows access to the negotiated timeout.
>> The above means the shared ZooKeeper quorum may return a timeout value
>> different from that of zookeeper.session.timeout
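
(Plugging our numbers into that 2x / 20x rule to see what it means for us; this
is just my reading of the passage quoted above, with the tickTime of 2000 from
the shared quorum's zoo.cfg and the 300000 ms we request; the class name is
made up:)

public class TimeoutClamp {
    public static void main(String[] args) {
        int tickTime = 2000;            // tickTime in the shared quorum's zoo.cfg
        int requested = 300000;         // zookeeper.session.timeout in hbase-site.xml
        int minTimeout = 2 * tickTime;  // 4000 ms
        int maxTimeout = 20 * tickTime; // 40000 ms
        int negotiated = Math.min(Math.max(requested, minTimeout), maxTimeout);
        System.out.println("negotiated = " + negotiated + " ms"); // 40000 ms
        // To actually be granted 300000 ms, the server-side tickTime would
        // have to be at least 300000 / 20 = 15000 ms.
    }
}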
>>
>> Cheers
>>
>> On Wed, Jun 5, 2013 at 2:34 PM, Ameya Kantikar <ameya@groupon.com> wrote:
>>
>> > In zoo.cfg I have not set up this value explicitly. My zoo.cfg looks like:
>> >
>> > tickTime=2000
>> > initLimit=10
>> > syncLimit=5
>> >
>> > We use a common ZooKeeper cluster for 2 of our HBase clusters. I'll try
>> > increasing this value in zoo.cfg.
>> > However, is it possible to set this value per cluster?
>> > I thought this property in hbase-site.xml takes care of that:
>> > zookeeper.session.timeout
>> >
>> >
>> > On Wed, Jun 5, 2013 at 1:49 PM, Kevin O'dell <kevin.odell@cloudera.com> wrote:
>> >
>> > > Ameya,
>> > >
>> > >   What does your zoo.cfg say for your timeout value?
>> > >
>> > >
>> > > On Wed, Jun 5, 2013 at 4:47 PM, Ameya Kantikar <ameya@groupon.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > We have heavy map reduce write jobs running against our cluster. Every
>> > > > once in a while, we see a region server going down.
>> > > > We are on: 0.94.2-cdh4.2.0, r
>> > > >
>> > > > We have done some tuning for heavy map reduce jobs, and have increased
>> > > > scanner timeouts and lease timeouts, and have also tuned the memstore as
>> > > > follows:
>> > > >
>> > > > hbase.hregion.memstore.block.multiplier: 4
>> > > > hbase.hregion.memstore.flush.size: 134217728
>> > > > hbase.hstore.blockingStoreFiles: 100
>> > > >
>> > > > So now, we are still facing issues. Looking at the logs, it looks like it
>> > > > is due to a ZooKeeper timeout. We have tuned the ZooKeeper settings as
>> > > > follows in hbase-site.xml:
>> > > >
>> > > > zookeeper.session.timeout: 300000
>> > > > hbase.zookeeper.property.tickTime: 6000
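
(For completeness, a rough sketch of one way to check what actually ends up in
the effective configuration; it assumes the HBase client jars and this
hbase-site.xml are on the classpath, and the class name is made up:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class PrintZkSettings {
    public static void main(String[] args) {
        // Loads hbase-default.xml plus whichever hbase-site.xml is on the classpath.
        Configuration conf = HBaseConfiguration.create();
        System.out.println("zookeeper.session.timeout = "
                + conf.get("zookeeper.session.timeout"));
        System.out.println("hbase.zookeeper.property.tickTime = "
                + conf.get("hbase.zookeeper.property.tickTime"));
    }
}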
>> > > >
>> > > >
>> > > > The actual log looks like:
>> > > >
>> > > >
>> > > > 2013-06-05 11:46:40,405 WARN org.apache.hadoop.ipc.HBaseServer:
>> > > > (responseTooSlow):
>> > > > {"processingtimems":13468,"call":"next(6723331143689528698, 1000), rpc
>> > > > version=1, client version=29, methodsFingerPrint=54742778","client":
>> > > > "10.20.73.65:41721","starttimems":1370432786933,"queuetimems":1,
>> > > > "class":"HRegionServer","responsesize":39611416,"method":"next"}
>> > > >
>> > > > 2013-06-05 11:46:54,988 INFO org.apache.hadoop.io.compress.CodecPool:
>> > > > Got brand-new decompressor [.snappy]
>> > > >
>> > > > 2013-06-05 11:48:03,017 WARN org.apache.hadoop.hdfs.DFSClient:
>> > > > DFSOutputStream ResponseProcessor exception for block
>> > > > BP-53741567-10.20.73.56-1351630463427:blk_9026156240355850298_8775246
>> > > > java.io.EOFException: Premature EOF: no length prefix available
>> > > >         at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
>> > > >         at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
>> > > >         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:656)
>> > > >
>> > > > 2013-06-05 11:48:03,020 WARN org.apache.hadoop.hbase.util.Sleeper: *We
>> > > > slept 48686ms instead of 3000ms*, this is likely due to a long garbage
>> > > > collecting pause and it's usually bad, see
>> > > > http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
>> > > >
>> > > > 2013-06-05 11:48:03,094 FATAL
>> > > > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
>> > > > smartdeals-hbase14-snc1.snc1,60020,1370373396890: Unhandled exception:
>> > > > org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
>> > > > currently processing smartdeals-hbase14-snc1.snc1,60020,1370373396890 as
>> > > > dead server
>> > > >
>> > > > (Not sure why it says 3000ms when we have timeout at 300000ms)
>> > > >
>> > > > We have done some GC tuning as well. Wondering what I can tune to keep the
>> > > > RS from going down? Any ideas?
>> > > > This is a batch-heavy cluster, and we care less about read latency. We can
>> > > > increase RAM a bit more, but not much (the RS already has 20GB of memory).
>> > > >
>> > > > Thanks in advance.
>> > > >
>> > > > Ameya
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Kevin O'Dell
>> > > Systems Engineer, Cloudera
>> > >
>> >
>>
>
>
