hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From F21 <f21.gro...@gmail.com>
Subject HBase process not starting and uses a lot of CPU
Date Thu, 17 Dec 2015 03:46:20 GMT
Background:

I am prototyping a HBase cluster using docker. Docker is 1.9.1 and is 
running in a Ubuntu 15.10 64-bit VM with access to 6GB of RAM.

Within docker, I am running 1 Zookeeper, HDFS (2.7.1) in HA mode (1 name 
node, 1 standby name node, 3 journal nodes, 2 zookeeper failover 
controllers (colocated with namenodes) and 3 datanodes).

In terms of HBase I am running 1.1.2 and have 2 masters and 2 region 
servers set up to use the HDFS cluster.

All of the above are running Oracle Java 8.

I am launching all my docker containers using docker-compose. However, I 
have startup scripts in place to check that the HDFS cluster is up and 
safemode is off before launching the HBase servers.

Problem:
When launching the regionservers and masters, they do not launch 
reliably. Often times, there will be one or more regionservers and 
masters which do not launch properly. In those cases, the failed process 
will be using 100% of the CPU core it is launched on and use very little 
memory (about 20 MB). The process hangs and we need to forcefully 
terminate it.

In the log files, we see that hbase-hdfs-master-hmaster2.log is empty 
and hbase-hdfs-master-hmaster2.out contains some information but not much:

Thu Dec 17 02:37:26 UTC 2015 Starting master on hmaster2
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 23668
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1048576
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

This is the command we are using to launch the hbase process:

sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh --config /opt/hbase/conf 
start master

The hbase-site.xml looks like this:

<configuration>
   <property>
     <name>hbase.rootdir</name>
     <value>hdfs://mycluster/hbase</value>
   </property>
   <property>
     <name>zookeeper.znode.parent</name>
     <value>/hbase</value>
   </property>
   <property>
     <name>hbase.cluster.distributed</name>
     <value>true</value>
   </property>
   <property>
     <name>hbase.zookeeper.quorum</name>
     <value>zk1</value>
   </property>
   <property>
     <name>hbase.master.loadbalancer.class</name>
<value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
   </property>
   <property>
     <name>hbase.coprocessor.master.classes</name>
<value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
   </property>
</configuration>

The hdfs-site.xml looks like this:

<configuration>
   <property>
     <name>dfs.nameservices</name>
     <value>mycluster</value>
   </property>
   <property>
     <name>dfs.ha.namenodes.mycluster</name>
     <value>nn1,nn2</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
     <value>nn1:8020</value>
   </property>
   <property>
     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
     <value>nn2:8020</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.mycluster.nn1</name>
     <value>nn1:50070</value>
   </property>
   <property>
     <name>dfs.namenode.http-address.mycluster.nn2</name>
     <value>nn2:50070</value>
   </property>
   <property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
   </property>
</configuration>

The core-site.xml looks like this:
<configuration>
<property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property></configuration>

And hbase-env.sh looks like this:
# Set environment variables here.

# This script sets variables multiple times over the course of starting 
an hbase process,
# so try to keep things idempotent unless you want to take an even 
deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.7+ required.
# export JAVA_HOME=/usr/java/jdk1.6.0/

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use. Default is left to JVM default.
# export HBASE_HEAPSIZE=1G

# Uncomment below if you intend to use off heap cache. For example, to 
allocate 8G of
# offheap, set the value to "8G".
# export HBASE_OFFHEAPSIZE=1G

# Extra Java runtime options.
# Below are what we set by default.  May only work with SUN JVM.
# For more on why as well as other possible settings,
# see http://wiki.apache.org/hadoop/PerformanceTuning
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

# Configure PermSize. Only needed in JDK7. You can safely remove it for 
JDK8+
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m 
-XX:MaxPermSize=128m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS 
-XX:PermSize=128m -XX:MaxPermSize=128m"

# Uncomment one of the below three options to enable java garbage 
collection logging for the server-side processes.

# This enables basic gc logging to the .out file.
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be 
generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log 
rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be 
generated in the HBASE_LOG_DIR .
# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# Uncomment one of the below three options to enable java garbage 
collection logging for the client processes.

# This enables basic gc logging to the .out file.
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps"

# This enables basic gc logging to its own file.
# If FILE-PATH is not replaced, the log file(.gc) would still be 
generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"

# This enables basic GC logging to its own file with automatic log 
rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
# If FILE-PATH is not replaced, the log file(.gc) would still be 
generated in the HBASE_LOG_DIR .
# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation 
-XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"

# See the package documentation for org.apache.hadoop.hbase.io.hfile for 
other configurations
# needed setting up off-heap block caching.

# Uncomment and adjust to enable JMX exporting
# See jmxremote.password and jmxremote.access in 
$JRE_HOME/lib/management to configure remote password access.
# More details at: 
http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
# NOTE: HBase provides an alternative JMX implementation to fix the 
random ports issue, please see JMX
# section in HBase Reference Guide for instructions.

# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false"
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10101"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS 
$HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10103"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10104"
# export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE 
-Dcom.sun.management.jmxremote.port=10105"

# File naming hosts on which HRegionServers will run. 
$HBASE_HOME/conf/regionservers by default.
# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers

# Uncomment and adjust to keep all the Region Server pages mapped to be 
memory resident
#HBASE_REGIONSERVER_MLOCK=true
#HBASE_REGIONSERVER_UID="hbase"

# File naming hosts on which backup HMaster will run. 
$HBASE_HOME/conf/backup-masters by default.
# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters

# Extra ssh options.  Empty by default.
# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"

# Where log files are stored.  $HBASE_HOME/logs by default.
# export HBASE_LOG_DIR=${HBASE_HOME}/logs

# Enable remote JDWP debugging of major HBase processes. Meant for Core 
Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug 
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug 
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug 
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug 
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

# A string representing this instance of hbase. $USER by default.
# export HBASE_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HBASE_NICENESS=10

# The directory where pid files are stored. /tmp by default.
# export HBASE_PID_DIR=/var/hadoop/pids

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HBASE_SLAVE_SLEEP=0.1

# Tell HBase whether it should manage it's own instance of Zookeeper or not.
# export HBASE_MANAGES_ZK=true

# The default log rolling policy is RFA, where the log file is rolled as 
per the size defined for the
# RFA appender. Please refer to the log4j.properties file to see more 
details on this appender.
# In case one needs to do log rolling on a date change, one should set 
the environment property
# HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
# For example:
# HBASE_ROOT_LOGGER=INFO,DRFA
# The reason for changing default to RFA is to avoid the boundary case 
of filling out disk space as
# DRFA doesn't put any cap on the log size. Please refer to HBase-5655 
for more context.
export HBASE_LOG_DIR=/var/log/hbase
export HBASE_PID_DIR=/var/run/hbase
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

The server still has plenty of ram available (1GB).

It's not clear what is causing this as the logs are pretty sparse. Have 
any of you seen a problem like this before?

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message