hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: HBase process not starting and uses a lot of CPU
Date Thu, 17 Dec 2015 03:53:37 GMT
I noticed Phoenix config parameters. Are the Phoenix jars in place?

Can you capture a jstack of the master when this happens?
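For example, something along these lines (a sketch; the pid-file name is an assumption based on the HBASE_PID_DIR=/var/run/hbase setting in your hbase-env.sh and the hdfs user you launch with):

```shell
# Hypothetical pid-file path; hbase-daemon.sh names it hbase-<user>-<daemon>.pid
PIDFILE=/var/run/hbase/hbase-hdfs-master.pid

if [ -r "$PIDFILE" ] && command -v jstack >/dev/null 2>&1; then
  MASTER_PID=$(cat "$PIDFILE")
  # Take a few dumps several seconds apart so persistently busy threads stand out
  for i in 1 2 3; do
    jstack "$MASTER_PID" > "/tmp/hmaster-jstack-$i.txt"
    sleep 5
  done
else
  echo "pid file or jstack not found; adjust PIDFILE for your layout"
fi
```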

Cheers

> On Dec 16, 2015, at 7:46 PM, F21 <f21.groups@gmail.com> wrote:
> 
> Background:
> 
> I am prototyping an HBase cluster using Docker. Docker is 1.9.1 and is running in an Ubuntu 15.10 64-bit VM with access to 6 GB of RAM.
> 
> Within Docker, I am running 1 ZooKeeper node and HDFS (2.7.1) in HA mode: 1 active namenode, 1 standby namenode, 3 journal nodes, 2 ZooKeeper failover controllers (colocated with the namenodes), and 3 datanodes.
> 
> For HBase, I am running 1.1.2 with 2 masters and 2 region servers set up to use the HDFS cluster.
> 
> All of the above are running Oracle Java 8.
> 
> I am launching all my Docker containers using docker-compose. However, I have startup scripts in place to check that the HDFS cluster is up and safemode is off before launching the HBase servers.
> 
> Problem:
> When launching the region servers and masters, they do not launch reliably. Often, one or more region servers or masters fail to launch properly. In those cases, the failed process uses 100% of the CPU core it runs on while using very little memory (about 20 MB). The process hangs and we need to forcefully terminate it.
> 
> In the log files, we see that hbase-hdfs-master-hmaster2.log is empty, and hbase-hdfs-master-hmaster2.out contains some information but not much:
> 
> Thu Dec 17 02:37:26 UTC 2015 Starting master on hmaster2
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 23668
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1048576
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 1048576
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> 
> This is the command we are using to launch the hbase process:
> 
> sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh --config /opt/hbase/conf start master
> 
> The hbase-site.xml looks like this:
> 
> <configuration>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://mycluster/hbase</value>
>  </property>
>  <property>
>    <name>zookeeper.znode.parent</name>
>    <value>/hbase</value>
>  </property>
>  <property>
>    <name>hbase.cluster.distributed</name>
>    <value>true</value>
>  </property>
>  <property>
>    <name>hbase.zookeeper.quorum</name>
>    <value>zk1</value>
>  </property>
>  <property>
>    <name>hbase.master.loadbalancer.class</name>
> <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
>  </property>
>  <property>
>    <name>hbase.coprocessor.master.classes</name>
> <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
>  </property>
> </configuration>
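[Ed.: regarding the Phoenix classes configured above, a quick way to confirm a Phoenix jar is actually visible to HBase; the /opt/hbase paths are taken from the launch command in this thread, and `bin/hbase classpath` prints the runtime classpath HBase will use:]

```shell
# Look for a Phoenix jar in HBase's lib directory (path assumed from this setup)
ls /opt/hbase/lib/phoenix-*.jar 2>/dev/null || echo "no phoenix jar in /opt/hbase/lib"

# Or inspect the full runtime classpath HBase will actually use
/opt/hbase/bin/hbase classpath 2>/dev/null | tr ':' '\n' | grep -i phoenix || true
```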
> 
> The hdfs-site.xml looks like this:
> 
> <configuration>
>  <property>
>    <name>dfs.nameservices</name>
>    <value>mycluster</value>
>  </property>
>  <property>
>    <name>dfs.ha.namenodes.mycluster</name>
>    <value>nn1,nn2</value>
>  </property>
>  <property>
>    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>    <value>nn1:8020</value>
>  </property>
>  <property>
>    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>    <value>nn2:8020</value>
>  </property>
>  <property>
>    <name>dfs.namenode.http-address.mycluster.nn1</name>
>    <value>nn1:50070</value>
>  </property>
>  <property>
>    <name>dfs.namenode.http-address.mycluster.nn2</name>
>    <value>nn2:50070</value>
>  </property>
>  <property>
> <name>dfs.client.failover.proxy.provider.mycluster</name>
> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>  </property>
> </configuration>
> 
> The core-site.xml looks like this:
> <configuration>
> <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property></configuration>
> 
> And hbase-env.sh looks like this:
> # Set environment variables here.
> 
> # This script sets variables multiple times over the course of starting an hbase process,
> # so try to keep things idempotent unless you want to take an even deeper look
> # into the startup scripts (bin/hbase, etc.)
> 
> # The java implementation to use.  Java 1.7+ required.
> # export JAVA_HOME=/usr/java/jdk1.6.0/
> 
> # Extra Java CLASSPATH elements.  Optional.
> # export HBASE_CLASSPATH=
> 
> # The maximum amount of heap to use. Default is left to JVM default.
> # export HBASE_HEAPSIZE=1G
> 
> # Uncomment below if you intend to use off heap cache. For example, to allocate 8G of
> # offheap, set the value to "8G".
> # export HBASE_OFFHEAPSIZE=1G
> 
> # Extra Java runtime options.
> # Below are what we set by default.  May only work with SUN JVM.
> # For more on why as well as other possible settings,
> # see http://wiki.apache.org/hadoop/PerformanceTuning
> export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
> 
> # Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+
> export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
> export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
> 
> # Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.
> 
> # This enables basic gc logging to the .out file.
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
> 
> # This enables basic gc logging to its own file.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
> 
> # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
> 
> # Uncomment one of the below three options to enable java garbage collection logging for the client processes.
> 
> # This enables basic gc logging to the .out file.
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
> 
> # This enables basic gc logging to its own file.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
> 
> # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
> # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR.
> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
> 
> # See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations
> # needed setting up off-heap block caching.
> 
> # Uncomment and adjust to enable JMX exporting
> # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.
> # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
> # NOTE: HBase provides an alternative JMX implementation to fix the random ports issue, please see JMX
> # section in HBase Reference Guide for instructions.
> 
> # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"
> # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105"
> 
> # File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default.
> # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
> 
> # Uncomment and adjust to keep all the Region Server pages mapped to be memory resident
> #HBASE_REGIONSERVER_MLOCK=true
> #HBASE_REGIONSERVER_UID="hbase"
> 
> # File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default.
> # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
> 
> # Extra ssh options.  Empty by default.
> # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
> 
> # Where log files are stored.  $HBASE_HOME/logs by default.
> # export HBASE_LOG_DIR=${HBASE_HOME}/logs
> 
> # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
> 
> # A string representing this instance of hbase. $USER by default.
> # export HBASE_IDENT_STRING=$USER
> 
> # The scheduling priority for daemon processes.  See 'man nice'.
> # export HBASE_NICENESS=10
> 
> # The directory where pid files are stored. /tmp by default.
> # export HBASE_PID_DIR=/var/hadoop/pids
> 
> # Seconds to sleep between slave commands.  Unset by default.  This
> # can be useful in large clusters, where, e.g., slave rsyncs can
> # otherwise arrive faster than the master can service them.
> # export HBASE_SLAVE_SLEEP=0.1
> 
> # Tell HBase whether it should manage it's own instance of Zookeeper or not.
> # export HBASE_MANAGES_ZK=true
> 
> # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the
> # RFA appender. Please refer to the log4j.properties file to see more details on this appender.
> # In case one needs to do log rolling on a date change, one should set the environment property
> # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
> # For example:
> # HBASE_ROOT_LOGGER=INFO,DRFA
> # The reason for changing default to RFA is to avoid the boundary case of filling out disk space as
> # DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context.
> export HBASE_LOG_DIR=/var/log/hbase
> export HBASE_PID_DIR=/var/run/hbase
> export JAVA_HOME=/usr/lib/jvm/java-8-oracle
> 
> The server still has plenty of RAM available (about 1 GB free).
> 
> It's not clear what is causing this, as the logs are pretty sparse. Have any of you seen a problem like this before?
