hbase-user mailing list archives

From Henry Hung <YTHu...@winbond.com>
Subject RE: What cause region server to timeout other than long gc?
Date Wed, 23 Oct 2013 07:29:07 GMT
@xieliang: I will try -XX:+PrintGCApplicationStoppedTime, thank you.

As for the load: total requestPerSeconds is around 15000~30000 across 9 servers, with
numberOfOnlineRegions = 136.


I have also just uploaded the GC and regionserver log files to Dropbox:

https://dl.dropboxusercontent.com/u/60149953/gc-hbase.log.20131023

https://dl.dropboxusercontent.com/u/60149953/hbase-hadoop-regionserver-fchddn2.log


My setup is:

CentOS release 6.1 (Final)

Kernel 2.6.32-131.0.15.el6.x86_64 on an x86_64

Hadoop 1.0.4

HBase 0.94.6

HBASE_REGIONSERVER_OPTS="-XX:+UseParNewGC -Xmn256m -XX:CMSInitiatingOccupancyFraction=70 -Xmx6000m
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=2
-XX:GCLogFileSize=256M -Xloggc:/data1/hadoop/gc-hbase.log"

ulimit -a:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62853
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32639
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
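
For reference, a minimal sketch of how the suggested flag could be appended to the options above. It assumes the options are set in conf/hbase-env.sh (the usual place for HBASE_REGIONSERVER_OPTS) and that the region server is restarted afterwards; it is not a tested configuration for this cluster.

```shell
# Sketch only: append the safepoint-logging flag to the existing options.
# Assumption: this line is placed in conf/hbase-env.sh after the current
# HBASE_REGIONSERVER_OPTS assignment; the region server must be restarted.
HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS:-} -XX:+PrintGCApplicationStoppedTime"
export HBASE_REGIONSERVER_OPTS
```

With this flag HotSpot writes a "Total time for which application threads were stopped" line to the GC log for every safepoint pause, not only for GC pauses.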


-----Original Message-----
From: 谢良 [mailto:xieliang@xiaomi.com]
Sent: Wednesday, October 23, 2013 2:54 PM
To: user@hbase.apache.org
Subject: Re: What cause region server to timeout other than long gc?

Maybe you can try adding "-XX:+PrintGCApplicationStoppedTime"; then, if operations other than
GC caused a long safepoint pause, you will be able to find it in the log.
By the way, was the load high during that time? :)

Best,
Liang
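
Once the flag is enabled, the long pauses still have to be found in the GC log. The following is a rough sketch of one way to do that, assuming the log contains HotSpot's "Total time for which application threads were stopped: N seconds" lines. The function name long_pauses and the default threshold of 1.0 second are hypothetical choices for illustration, not part of HBase or the JVM.

```shell
#!/bin/sh
# Hypothetical helper: print GC-log lines whose safepoint pause is longer
# than a threshold. Assumes lines of the form
#   "... Total time for which application threads were stopped: N seconds"
# as written by -XX:+PrintGCApplicationStoppedTime.
long_pauses() {
    # $1 = path to the GC log, $2 = threshold in seconds (default 1.0)
    awk -v limit="${2:-1.0}" '
        /Total time for which application threads were stopped:/ {
            # The pause length is the field right after "stopped:".
            for (i = 1; i < NF; i++) {
                if ($i == "stopped:" && ($(i + 1) + 0) > (limit + 0)) {
                    print
                    break
                }
            }
        }
    ' "$1"
}
```

For example, `long_pauses /data1/hadoop/gc-hbase.log 5` would list pauses longer than 5 seconds; the path is the -Xloggc value from the options above, and the actual file name may differ when log rotation is enabled. Any long pause reported here that has no matching GC event nearby points at a non-GC safepoint operation.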
