hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: HBase process not starting and uses a lot of CPU
Date Thu, 17 Dec 2015 04:01:09 GMT
Type the following command:
which java

You should get a path, say <some-path>/bin/java
jstack is under <some-path>/bin/
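If `which java` points at a symlink (common with update-alternatives on Ubuntu), resolving it first avoids hunting in the wrong bin directory. A sketch; the fallback message is an assumption, not part of the original instructions:

```shell
# Sketch: resolve the real JDK location; readlink -f follows the
# update-alternatives symlink chain to the actual installation.
java_path=$(command -v java || true)
if [ -n "$java_path" ]; then
    jdk_bin=$(dirname "$(readlink -f "$java_path")")
    echo "expect jstack at: $jdk_bin/jstack"
else
    echo "java not on PATH; look under \$JAVA_HOME/bin instead"
fi
```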

Issue the following command:
ps aux | grep Master | grep -v grep

You should see the pid of the master process.

Then issue:
<some-path>/bin/jstack <pid>
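The steps above can be combined into one snippet (a sketch, not from the thread; `pgrep -f` matches the HMaster main class on the Java command line without matching itself, so no `grep -v` is needed):

```shell
# Sketch: find the HMaster pid and write its thread dump to a file.
pid=$(pgrep -f HMaster | head -n 1 || true)
if [ -n "$pid" ]; then
    jstack "$pid" > /tmp/hmaster.jstack
    echo "thread dump written to /tmp/hmaster.jstack"
else
    echo "no HMaster process found"
fi
```

If jstack cannot attach, `kill -3 <pid>` asks the JVM to print the same thread dump into the process's .out file instead.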

Cheers

On Wed, Dec 16, 2015 at 7:55 PM, F21 <f21.groups@gmail.com> wrote:

> Yes, the phoenix jars are in place and everything (including phoenix)
> works properly if the hbase server launches correctly. I was seeing this
> problem before I added phoenix to the mix.
>
> Can you provide instructions on how to capture the jstack (not very
> familiar with java)?
>
> Cheers
>
>
> On 17/12/2015 2:53 PM, Ted Yu wrote:
>
>> I noticed Phoenix config parameters. Are Phoenix jars in place ?
>>
>> Can you capture jstack of the master when this happens ?
>>
>> Cheers
>>
>> On Dec 16, 2015, at 7:46 PM, F21 <f21.groups@gmail.com> wrote:
>>>
>>> Background:
>>>
>>> I am prototyping an HBase cluster using Docker. Docker is 1.9.1 and is
>>> running in a Ubuntu 15.10 64-bit VM with access to 6GB of RAM.
>>>
>>> Within Docker, I am running 1 ZooKeeper node and HDFS (2.7.1) in HA mode
>>> (1 active namenode, 1 standby namenode, 3 journal nodes, 2 ZooKeeper
>>> failover controllers colocated with the namenodes, and 3 datanodes).
>>>
>>> For HBase, I am running 1.1.2 with 2 masters and 2 region servers
>>> set up to use the HDFS cluster.
>>>
>>> All of the above are running Oracle Java 8.
>>>
>>> I am launching all my docker containers using docker-compose. However, I
>>> have startup scripts in place to check that the HDFS cluster is up and
>>> safemode is off before launching the HBase servers.
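[A readiness gate of the kind described might look like the following. This is a sketch under assumptions: `hdfs dfsadmin -safemode get` is the standard CLI and is on PATH; the retry budget and sleep interval are arbitrary.]

```shell
# Sketch: block until HDFS reports safemode OFF, with a bounded retry
# budget so a broken cluster fails the gate instead of hanging forever.
wait_for_hdfs() {
    tries=${1:-30}
    while [ "$tries" -gt 0 ]; do
        if hdfs dfsadmin -safemode get 2>/dev/null | grep -q 'Safe mode is OFF'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```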
>>>
>>> Problem:
>>> When launching the regionservers and masters, they do not start
>>> reliably. Often, one or more regionservers or masters fail to launch
>>> properly. In those cases, the failed process pins the CPU core it runs
>>> on at 100% while using very little memory (about 20 MB). The process
>>> hangs and must be forcefully terminated.
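[A common technique for pinning down what a spinning JVM is doing, not mentioned in the thread: take the hot thread's TID from `top -H -p <pid>` and convert it to the hex `nid=` label that jstack prints for each thread.]

```shell
# Sketch: Linux thread ids are decimal; jstack tags each thread with
# nid=0x<hex tid>, so the conversion locates the hot thread in the dump.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}
tid_to_nid 12345   # prints nid=0x3039
```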
>>>
>>> In the log files, we see that hbase-hdfs-master-hmaster2.log is empty
>>> and hbase-hdfs-master-hmaster2.out contains some information but not much:
>>>
>>> Thu Dec 17 02:37:26 UTC 2015 Starting master on hmaster2
>>> core file size          (blocks, -c) unlimited
>>> data seg size           (kbytes, -d) unlimited
>>> scheduling priority             (-e) 0
>>> file size               (blocks, -f) unlimited
>>> pending signals                 (-i) 23668
>>> max locked memory       (kbytes, -l) 64
>>> max memory size         (kbytes, -m) unlimited
>>> open files                      (-n) 1048576
>>> pipe size            (512 bytes, -p) 8
>>> POSIX message queues     (bytes, -q) 819200
>>> real-time priority              (-r) 0
>>> stack size              (kbytes, -s) 8192
>>> cpu time               (seconds, -t) unlimited
>>> max user processes              (-u) 1048576
>>> virtual memory          (kbytes, -v) unlimited
>>> file locks                      (-x) unlimited
>>>
>>> This is the command we are using to launch the hbase process:
>>>
>>> sudo -u hdfs /opt/hbase/bin/hbase-daemon.sh --config /opt/hbase/conf
>>> start master
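[hbase-daemon.sh records the daemon's pid under HBASE_PID_DIR, which the hbase-env.sh below sets to /var/run/hbase, so a post-launch liveness check is possible. A sketch; the pid file name is an assumption inferred from the log file names.]

```shell
# Sketch: check whether the master daemon actually came up by probing
# the pid recorded by hbase-daemon.sh (kill -0 tests without signaling).
master_running() {
    pf=${1:-/var/run/hbase/hbase-hdfs-master.pid}
    [ -f "$pf" ] && kill -0 "$(cat "$pf")" 2>/dev/null
}
if master_running; then
    echo "master is running"
else
    echo "master did not start (or pid file is missing)"
fi
```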
>>>
>>> The hbase-site.xml looks like this:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>hbase.rootdir</name>
>>>     <value>hdfs://mycluster/hbase</value>
>>>   </property>
>>>   <property>
>>>     <name>zookeeper.znode.parent</name>
>>>     <value>/hbase</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.cluster.distributed</name>
>>>     <value>true</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.zookeeper.quorum</name>
>>>     <value>zk1</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.master.loadbalancer.class</name>
>>> <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
>>>   </property>
>>>   <property>
>>>     <name>hbase.coprocessor.master.classes</name>
>>> <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
>>>   </property>
>>> </configuration>
>>>
>>> The hdfs-site.xml looks like this:
>>>
>>> <configuration>
>>>   <property>
>>>     <name>dfs.nameservices</name>
>>>     <value>mycluster</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.ha.namenodes.mycluster</name>
>>>     <value>nn1,nn2</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.rpc-address.mycluster.nn1</name>
>>>     <value>nn1:8020</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.rpc-address.mycluster.nn2</name>
>>>     <value>nn2:8020</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.http-address.mycluster.nn1</name>
>>>     <value>nn1:50070</value>
>>>   </property>
>>>   <property>
>>>     <name>dfs.namenode.http-address.mycluster.nn2</name>
>>>     <value>nn2:50070</value>
>>>   </property>
>>>   <property>
>>> <name>dfs.client.failover.proxy.provider.mycluster</name>
>>>
>>> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>>>   </property>
>>> </configuration>
>>>
>>> The core-site.xml looks like this:
>>> <configuration>
>>>
>>> <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property></configuration>
>>>
>>> And hbase-env.sh looks like this:
>>> # Set environment variables here.
>>>
>>> # This script sets variables multiple times over the course of starting
>>> an hbase process,
>>> # so try to keep things idempotent unless you want to take an even
>>> deeper look
>>> # into the startup scripts (bin/hbase, etc.)
>>>
>>> # The java implementation to use.  Java 1.7+ required.
>>> # export JAVA_HOME=/usr/java/jdk1.6.0/
>>>
>>> # Extra Java CLASSPATH elements.  Optional.
>>> # export HBASE_CLASSPATH=
>>>
>>> # The maximum amount of heap to use. Default is left to JVM default.
>>> # export HBASE_HEAPSIZE=1G
>>>
>>> # Uncomment below if you intend to use off heap cache. For example, to
>>> allocate 8G of
>>> # offheap, set the value to "8G".
>>> # export HBASE_OFFHEAPSIZE=1G
>>>
>>> # Extra Java runtime options.
>>> # Below are what we set by default.  May only work with SUN JVM.
>>> # For more on why as well as other possible settings,
>>> # see http://wiki.apache.org/hadoop/PerformanceTuning
>>> export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
>>>
>>> # Configure PermSize. Only needed in JDK7. You can safely remove it for
>>> JDK8+
>>> export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m
>>> -XX:MaxPermSize=128m"
>>> export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS
>>> -XX:PermSize=128m -XX:MaxPermSize=128m"
>>>
>>> # Uncomment one of the below three options to enable java garbage
>>> collection logging for the server-side processes.
>>>
>>> # This enables basic gc logging to the .out file.
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps"
>>>
>>> # This enables basic gc logging to its own file.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> generated in the HBASE_LOG_DIR .
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>>>
>>> # This enables basic GC logging to its own file with automatic log
>>> rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> generated in the HBASE_LOG_DIR .
>>> # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation
>>> -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>>>
>>> # Uncomment one of the below three options to enable java garbage
>>> collection logging for the client processes.
>>>
>>> # This enables basic gc logging to the .out file.
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps"
>>>
>>> # This enables basic gc logging to its own file.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> generated in the HBASE_LOG_DIR .
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
>>>
>>> # This enables basic GC logging to its own file with automatic log
>>> rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
>>> # If FILE-PATH is not replaced, the log file(.gc) would still be
>>> generated in the HBASE_LOG_DIR .
>>> # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails
>>> -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation
>>> -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
>>>
>>> # See the package documentation for org.apache.hadoop.hbase.io.hfile
>>> for other configurations
>>> # needed setting up off-heap block caching.
>>>
>>> # Uncomment and adjust to enable JMX exporting
>>> # See jmxremote.password and jmxremote.access in
>>> $JRE_HOME/lib/management to configure remote password access.
>>> # More details at:
>>> http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html
>>> # NOTE: HBase provides an alternative JMX implementation to fix the
>>> random ports issue, please see JMX
>>> # section in HBase Reference Guide for instructions.
>>>
>>> # export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false
>>> -Dcom.sun.management.jmxremote.authenticate=false"
>>> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE
>>> -Dcom.sun.management.jmxremote.port=10101"
>>> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS
>>> $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"
>>> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE
>>> -Dcom.sun.management.jmxremote.port=10103"
>>> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE
>>> -Dcom.sun.management.jmxremote.port=10104"
>>> # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE
>>> -Dcom.sun.management.jmxremote.port=10105"
>>>
>>> # File naming hosts on which HRegionServers will run.
>>> $HBASE_HOME/conf/regionservers by default.
>>> # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
>>>
>>> # Uncomment and adjust to keep all the Region Server pages mapped to be
>>> memory resident
>>> #HBASE_REGIONSERVER_MLOCK=true
>>> #HBASE_REGIONSERVER_UID="hbase"
>>>
>>> # File naming hosts on which backup HMaster will run.
>>> $HBASE_HOME/conf/backup-masters by default.
>>> # export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
>>>
>>> # Extra ssh options.  Empty by default.
>>> # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"
>>>
>>> # Where log files are stored.  $HBASE_HOME/logs by default.
>>> # export HBASE_LOG_DIR=${HBASE_HOME}/logs
>>>
>>> # Enable remote JDWP debugging of major HBase processes. Meant for Core
>>> Developers
>>> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
>>> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
>>> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
>>> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
>>>
>>> # A string representing this instance of hbase. $USER by default.
>>> # export HBASE_IDENT_STRING=$USER
>>>
>>> # The scheduling priority for daemon processes.  See 'man nice'.
>>> # export HBASE_NICENESS=10
>>>
>>> # The directory where pid files are stored. /tmp by default.
>>> # export HBASE_PID_DIR=/var/hadoop/pids
>>>
>>> # Seconds to sleep between slave commands.  Unset by default.  This
>>> # can be useful in large clusters, where, e.g., slave rsyncs can
>>> # otherwise arrive faster than the master can service them.
>>> # export HBASE_SLAVE_SLEEP=0.1
>>>
>>> # Tell HBase whether it should manage its own instance of Zookeeper or
>>> not.
>>> # export HBASE_MANAGES_ZK=true
>>>
>>> # The default log rolling policy is RFA, where the log file is rolled as
>>> per the size defined for the
>>> # RFA appender. Please refer to the log4j.properties file to see more
>>> details on this appender.
>>> # In case one needs to do log rolling on a date change, one should set
>>> the environment property
>>> # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA".
>>> # For example:
>>> # HBASE_ROOT_LOGGER=INFO,DRFA
>>> # The reason for changing default to RFA is to avoid the boundary case
>>> of filling out disk space as
>>> # DRFA doesn't put any cap on the log size. Please refer to HBase-5655
>>> for more context.
>>> export HBASE_LOG_DIR=/var/log/hbase
>>> export HBASE_PID_DIR=/var/run/hbase
>>> export JAVA_HOME=/usr/lib/jvm/java-8-oracle
>>>
>>> The server still has plenty of RAM available (1GB).
>>>
>>> It's not clear what is causing this as the logs are pretty sparse. Have
>>> any of you seen a problem like this before?
>>>
>>
>
