hbase-user mailing list archives

From Wayne <wav...@gmail.com>
Subject Re: SocketTimeoutException caused by GC?
Date Thu, 27 Jan 2011 19:38:57 GMT
I assumed the GC log was *trying* to roll. It shows the last 30 min of logs,
with control characters at the end.

We are not all writes. In terms of writes we can wait, and the zookeeper
timeout can go way up, but we also need to support real-time (end-user)
reads, which is why increasing the zookeeper timeout is not our first
choice (we would rather decrease it). The funny part is that .90 seems
faster for us and churns through writes at a faster clip, thereby probably
becoming less stable sooner because the JVM can't keep up.
Should we schedule a rolling restart every 24 hours? How do production
systems accept volume writes through the front door without melting the JVM
due to fragmentation? We could possibly switch to bulk writes, but performance
is not our problem...stability is. We are pushing 40k writes/node/sec
sustained with well-balanced regions, hour after hour, day after day (until a
zookeeper tear down).
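For what it's worth, the timeout being discussed here is the
zookeeper.session.timeout property in hbase-site.xml; a minimal sketch of
setting it, where the 60000 ms value is purely illustrative, not a
recommendation:

```xml
<!-- hbase-site.xml: sketch only; 60000 ms is an illustrative value -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
</property>
```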

Great to hear it is actively being looked at. I will keep an eye on #3455.

Below are our GC options, many of which come from work with the other java
database. Should I go back to the default settings? Or should I use those
referenced in the Jira #3455 (-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=65 -Xms8g -Xmx8g)? We are currently using:

export HBASE_HEAPSIZE=8192
export HBASE_OPTS="-XX:+UseCMSInitiatingOccupancyOnly \
  -XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled \
  -XX:SurvivorRatio=8 -XX:NewRatio=3 -XX:MaxTenuringThreshold=1 \
  -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC"
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
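One way to keep more GC history around, a sketch assuming a HotSpot build
new enough to support the log-rotation flags (the file count and size are
illustrative, not tuned values):

```shell
# Sketch: let HotSpot rotate its own GC log instead of losing history
# when the file rolls. These flags exist only in later HotSpot builds;
# check your JVM before relying on them.
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=8M"
```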

Thanks for your help!

On Thu, Jan 27, 2011 at 1:53 PM, Stack <stack@duboce.net> wrote:

> On Thu, Jan 27, 2011 at 6:48 AM, Wayne <wav100@gmail.com> wrote:
> > We have got .90 up and running well, but again after 24 hours of loading
> > a node went down. Under it all I assume it is a GC issue, but the GC
> > logging rolls every < 60 minutes so I can never see logs from 5 hours
> > ago (working on getting Scribe up to solve that).
> Is it possible to ask GC logs roll?  What's the config, please?
> > Most of our issues are a node being marked as dead after being
> > unresponsive. It often starts with a socket timeout. We can turn up the
> > timeout for zookeeper but that is not dealing with the issue.
> >
> > Here is the first sign of trouble. Is the 1 min 34 second gap below most
> > likely a stop-the-world GC?
> >
> > 2011-01-27 07:00:43,716 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
> > Roll /hbase/.logs/x.x.x.6,60020,1295969329357/x.x.x.6%3A60020.1296111623011,
> > entries=242, filesize=69508440. New hlog
> > /hbase/.logs/x.x.x.6,60020,1295969329357/x.x.x.6%3A60020.1296111643436
> > 2011-01-27 07:02:17,663 WARN org.apache.hadoop.hdfs.DFSClient:
> > DFSOutputStream ResponseProcessor exception for block
> > blk_-5705652521953118952_104835 java.net.SocketTimeoutException: 69000
> > millis timeout while waiting for channel to be ready for read. ch :
> > java.nio.channels.SocketChannel[connected local=/x.x.x.6:48141
> > remote=/x.x.x.6:50010]
> >
> Possibly.  What's the log history like before that?
> What was going on in the local DataNode at that time?  Seems like we
> failed talking to it (I'm guessing x.x.x.6, the local datanode).  It
> could, though, be because of a GC pause.
> > It is followed by zookeeper complaining due to lack of a response.
> >
> > 2011-01-27 07:02:17,665 INFO org.apache.zookeeper.ClientCnxn: Client
> > session timed out, have not heard from server in 94590ms for sessionid
> > 0x2dbdc88700000e, closing socket connection and attempting reconnect
> >
> This is a strong indicator that there was a long GC pause around this time.
> > There is also a message about the data node.
> >
> > 2011-01-27 07:02:17,665 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_4267667433820446273_104837 bad datanode[0]
> > x.x.x.6:50010
> >
> This is probably just the DFSClient up in the RegionServer marking the
> local datanode bad because of the above timeout talking to it
> (possibly because of a GC pause).  It's marking the first datanode in
> the pipeline as bad, and it's probably going to try the next node in
> the pipeline.
> > And eventually the node is brought down.
> >
> > 2011-01-27 07:02:17,783 FATAL
> > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region
> > server...
> >
> This happened because the session expired on zk?  You should be able to
> see it earlier in the log.
> > The data node also shows some errors.
> >
> > 2011-01-27 07:02:17,667 ERROR
> > org.apache.hadoop.hdfs.server.datanode.DataNode:
> > DatanodeRegistration(x.x.x.6:50010,
> > storageID=DS-1438948528-x.x.x.6-50010-1295969305669, infoPort=50075,
> > ipcPort=50020):DataXceiver java.net.SocketException: Connection reset
> >
> Oh, you did check the DN logs.
> So, does this look like the DN logging the RS dropping the connection
> because of a timeout?  (Check the IPs -- see if it makes sense reading
> it that way.)
> >
> > Any help, advice, ideas, or guesses would be greatly appreciated. Can
> > anyone sustain 30-40k writes/node/sec for days/weeks on end without
> > using the bulk loader? Am I rolling a rock uphill against the reality
> > of the JVM?
> >
> You are going to run into heap fragmentation that brings on the
> stop-the-world GCs.  Some good progress has been made on this
> high-priority issue recently, see
> https://issues.apache.org/jira/browse/HBASE-3455, but a 'solution' is
> still to come.  Even then, my guess is that we would not be able to
> guarantee never running into a stop-the-world GC.
> Are you all about writes?  If so, the bulk loader runs at one to two
> orders of magnitude faster than going via the API.  For this reason alone,
> it's worth exploring (you can do incremental bulk loads to a running
> table).  If reads are rare, you might shrink your heap so the
> stop-the-world pauses complete quicker, or you could try giving HBase a
> really large heap; perhaps the elbow room will make it so we avoid
> fragmentation (but it might make it worse).  Reads, and reading from
> cache, do not seem to bring on the stop-the-world GC the way writes do.
> Are your writes latency sensitive such that you can't tolerate the
> occasional long pause?  If not, up your zk lease timeouts.
> St.Ack
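For the incremental bulk load Stack describes, the rough 0.90-era shape is:
a MapReduce job writes HFiles via HFileOutputFormat, then the
completebulkload tool moves them into the live table. A hedged sketch; the
jar name, HDFS path, and table name below are illustrative placeholders:

```shell
# Sketch of the incremental bulk load step (paths/table are placeholders).
# Step 1 (not shown): a MapReduce job using HFileOutputFormat writes HFiles
# under /user/wayne/hfiles, one subdirectory per column family.
# Step 2: hand the finished HFiles to the running table:
hadoop jar hbase-0.90.0.jar completebulkload /user/wayne/hfiles mytable
```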
