hbase-user mailing list archives

From Sumit Nigam <sumit_o...@yahoo.com.INVALID>
Subject Re: Region server crashes after GC pause time
Date Fri, 22 Apr 2016 13:39:17 GMT
Hi Karthik,
As Ding said, you will eventually need to look at your heap objects to understand what
is producing so many objects in the first place.
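For example, a quick way to see what dominates a region server heap (the PID is yours
to fill in; jmap ships with the JDK) is something like:

  # histogram of the largest live object types
  jmap -histo:live <regionserver-pid> | head -30

  # full dump for offline analysis in Eclipse MAT or jhat
  jmap -dump:live,format=b,file=/tmp/rs-heap.hprof <regionserver-pid>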
The CMS initiating occupancy fraction defaults to 70%. The default block cache and
in-memory memstore sizes add up to around 75-80% of the heap, which means those two
alone can push CMS into walking your heap, and that is before counting HBase scan
objects and the like. If you still cannot figure a way out even after analysing your
heap, you may want to consider the G1 collector (especially if you are running heaps
larger than 15-20G).
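If you do try G1, a minimal starting point in hbase-env.sh might look like the
following (the pause target and occupancy values are illustrative, not tuned for
your workload):

  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
    -XX:+UseG1GC \
    -XX:MaxGCPauseMillis=100 \
    -XX:InitiatingHeapOccupancyPercent=65"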
Increasing the ZooKeeper maximum session timeout may also give you temporary respite,
since it keeps the region server 'alive' on ZooKeeper for longer. The downside is that
a genuinely dead server will now take a little longer to be marked DEAD, and ZooKeeper
clients are allowed to negotiate a longer timeout. Whether that is acceptable really
depends on your use case.
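The knob on the HBase side is zookeeper.session.timeout in hbase-site.xml (the value
below is only an example; the ZooKeeper server's maxSessionTimeout must also allow it):

  <property>
    <name>zookeeper.session.timeout</name>
    <!-- 120 seconds; illustrative only -->
    <value>120000</value>
  </property>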
You could also move the block cache off-heap, which saves almost 40% of the heap. But
before doing that, be very sure you have exhausted the analysis of the heap data in
your region servers. You may also benefit from lowering the heap size and launching
more region servers instead.
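For reference, an off-heap BucketCache is configured roughly like this (the 4G cache
size and direct memory limit are placeholders, to be sized for your machine):

  In hbase-site.xml:
    <property><name>hbase.bucketcache.ioengine</name><value>offheap</value></property>
    <property><name>hbase.bucketcache.size</name><value>4096</value></property>

  In hbase-env.sh (direct memory must cover the bucket cache):
    export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=5g"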
Thanks,
Sumit

      From: karthi keyan <karthi93.sankar@gmail.com>
 To: user@hbase.apache.org; dingdongchao@baidu.com 
 Sent: Friday, April 22, 2016 5:34 PM
 Subject: Re: Region server crashes after GC pause time
   
Hi Ding,

I have increased the heap to 2G and am still getting an out of memory exception.
I am writing data to HBase at about 40K writes/sec.
Is there any parameter to tune? To my knowledge,
"-XX:CMSInitiatingOccupancyFraction=N" is what I have tuned in HBase.
Is there any other parameter required to resolve this issue?
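To be concrete, this is the kind of setting I mean in hbase-env.sh (the value 70 below
is just a placeholder for N, not a recommendation):

  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
    -XX:+UseConcMarkSweepGC \
    -XX:CMSInitiatingOccupancyFraction=70 \
    -XX:+UseCMSInitiatingOccupancyOnly"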

Best,
Karthik

On Thu, Apr 14, 2016 at 12:21 PM, Ding,Dongchao <dingdongchao@baidu.com>
wrote:

> Dump the JVM heap, analyse it, and find which query (or queries) is costing so
> much memory.
> In a bad case I once had, the RS crashed from long GC pauses because of a big
> batch Get query.
>
>
> In addition, I think you should increase the JVM memory; 512m is very small for
> an RS.
>
>
>
> On 16/4/14 14:00, "karthi keyan" <karthi93.sankar@gmail.com> wrote:
>
> >Hi,
> >
> >I hit this issue in HBase at peak time while handling more requests. Can
> >anyone please guide me to resolve the long GC pauses in HBase?
> >
> >JDK 7, JVM heap 512m
> >
> >HBase 0.98.13
> >
> >
> > INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or
> >host machine (eg GC): pause of approximately 1466ms
> >GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=1967ms
> > INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or
> >host machine (eg GC): pause of approximately 2304ms
> >GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=2775ms
> > INFO  [JvmPauseMonitor] util.JvmPauseMonitor: Detected pause in JVM or
> >host machine (eg GC): pause of approximately 2287ms
> >GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=2775ms
> >
> > INFO  [RS:0;0:0:0:0:0:0:0:0:44037-SendThread(<host>:2181)]
> >zookeeper.ClientCnxn: Client session timed out, have not heard from server
> >in 6819ms for sessionid 0x1540ab48b280004, closing socket connection and
> >attempting reconnect
> > INFO  [SplitLogWorker-<host>,44037,1460468489645-SendThread(<host>:2181)]
> >zookeeper.ClientCnxn: Client session timed out, have not heard from server
> >in 6819ms for sessionid 0x1540ab48b280005, closing socket connection and
> >attempting reconnect
> >
> >After this, the HBase region server moved to the Dead state.
> >
> >Best,
> >Karthik
>
>

  