hbase-user mailing list archives

From Dima Spivak <dimaspi...@apache.org>
Subject Re: Hbase cluster not getting UP one Region server get down
Date Thu, 20 Oct 2016 22:37:22 GMT
It can be lots of things, Manjeet. You've gotta do a bit of troubleshooting
yourself first; a long dump of your machine specs doesn't change that.

Can you describe what happened before/after the node went down? The log
just says server isn't running, so we can't tell much from that alone.

-Dima
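
One way to gather that before/after context is to pull the warning and error lines from the master and region server logs around the time the node went down. The following is only a minimal sketch: it assumes the logs use a log4j-style layout where each entry starts with an ISO8601 timestamp followed by the level, and `notable_lines` is a hypothetical helper name, not part of HBase.

```python
import re
from datetime import datetime, timedelta

# Assumed layout (log4j ISO8601 date, then level), i.e.
#   "<yyyy-MM-dd HH:mm:ss,SSS> <LEVEL> [thread] logger: message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+([A-Z]+)\s")


def notable_lines(log_lines, around, window_minutes=15,
                  levels=("WARN", "ERROR", "FATAL")):
    """Return lines at the given levels within +/- window_minutes of `around`."""
    start = around - timedelta(minutes=window_minutes)
    end = around + timedelta(minutes=window_minutes)
    hits = []
    for line in log_lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # continuation lines (stack frames) carry no timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if start <= stamp <= end and match.group(2) in levels:
            hits.append(line.rstrip("\n"))
    return hits
```

Passing an open log file and the approximate time of the failure returns the entries worth reading first; the window and level list are arbitrary defaults to adjust.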

On Wed, Oct 19, 2016 at 10:53 PM, Manjeet Singh <manjeet.chandhok@gmail.com>
wrote:

> I want to add a few more points.
>
> Below is my cluster configuration:
>
> Node-1
>   Distribution (CPU):  2x6 Core
>   Total (CPU):         12 core
>   Distribution (disk): 6x300 GB
>   OS (RAID-1):         300
>   DATA:                Single 900 GB RAID-10
>   Total RAM:           96
>   Components:          HBase Master, HDFS Name Node, ZooKeeper Server,
>                        Spark History Server, Phoenix, HDFS Balancer,
>                        Spark gateway, MySQL
>   YARN Resource Manager / Node Manager:
>     - YARN (MR2 Included) JobHistory Server
>       <http://192.168.129.121:7180/cmf/services/10/instances/25/status>
>     - ResourceManager
>       <http://192.168.129.121:7180/cmf/services/10/instances/26/status>
>   Node:                Name Node
>
> Node-2
>   Distribution (CPU):  2x6 Core
>   Total (CPU):         12 core
>   Distribution (disk): 6x300 GB
>   OS (RAID-1):         300
>   DATA:                300 GB X 6 Individual RAID-0
>   Total RAM:           80
>   Components:          HDFS data node, HBase Region, ZooKeeper Server,
>                        Spark, HBase Master
>   YARN Resource Manager / Node Manager:
>     - YARN (MR2 Included) NodeManager
>   Node:                Data Node, Spark Node
>
> Node-3
>   Distribution (CPU):  2x6 Core
>   Total (CPU):         12 core
>   Distribution (disk): 6x300 GB
>   OS (RAID-1):         300
>   DATA:                300 GB X 6 Individual RAID-0
>   Total RAM:           80
>   Components:          HDFS data node, HBase Region, ZooKeeper Server,
>                        Spark
>   YARN Resource Manager / Node Manager:
>     - YARN (MR2 Included) NodeManager
>   Node:                Data Node, Spark Node
>
> Node-4
>   Distribution (CPU):  2x6 Core
>   Total (CPU):         12 core
>   Distribution (disk): 8x300 GB
>   OS (RAID-1):         300
>   DATA:                300 GB X 6 Individual RAID-0
>   Total RAM:           80
>   Components:          HDFS data node, HBase Region, Spark
>   YARN Resource Manager / Node Manager:
>     - YARN (MR2 Included) NodeManager
>   Node:                Data Node, Spark Node
>
> I noticed that HBase was taking more time on reads, so I changed the
> properties below to improve its performance:
>
> Property Name                            | Original value | Changed value
> hfile.block.cache.size                   | 0.4            | 0.6
> hbase.regionserver.global.memstore.size  | 0.4            | 0.2
>
>
> Below is some more information.
>
> I have a Spark ETL job on the same cluster, and these are the figures
> after running it:
>
> Parameter: Value
>
> Number of pipelines: 2 (Kafka)
> Raw size of Kafka messages: 21 GB
> Data rate: 1 MB/sec per pipeline
> Size of aggregated data in HBase: 2.6 GB with Snappy and major compaction
> Batch duration: 30 sec
> Sliding window, window duration: 900 sec [15 minutes]
> CPU utilization: 63.2 %
> Number of executors: 3 per pipeline
> Allocated RAM: 3 GB per pipeline
> Cluster N/W IO: 3.2 MB/sec
> Cluster disk IO: 3.5 MB/sec
> Max time (highest peak) taken by Spark ETL for 900 MB of data to
>   process data for Domain: 2 hours
> Max time (highest peak) taken by Spark ETL for 900 MB of data to
>   process data for Application: 30 minutes
> Total time taken by Kafka simulator to push the data into Kafka: 6 h
> Total time taken by Spark ETL to process all the data: 7 h
> Number of SQL queries: 10
> Number of profiles: 9
> Number of rows in HBase: 11015719
>
>
> Thanks
> Manjeet
>
>
> On Thu, Oct 20, 2016 at 10:45 AM, Manjeet Singh
> <manjeet.chandhok@gmail.com> wrote:
>
> > Hi All,
> > Can anyone help me figure out the root cause? I have a 4-node cluster
> > and one data node went down. I do not understand why my HBase Master is
> > not able to come up.
> >
> > I have the log below:
> >
> > ERROR: org.apache.hadoop.hbase.ipc.ServerNotRunningYetException: Server is not running yet
> >         at org.apache.hadoop.hbase.master.HMaster.checkServiceStarted(HMaster.java:2296)
> >         at org.apache.hadoop.hbase.master.MasterRpcServices.isMasterRunning(MasterRpcServices.java:936)
> >         at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55654)
> >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> >
> > Thanks
> > Manjeet
> >
> > --
> > luv all
> >
>
>
>
> --
> luv all
>
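
A side note on the two heap settings quoted above (hfile.block.cache.size raised to 0.6 and hbase.regionserver.global.memstore.size lowered to 0.2): as far as I know, HBase expects the block cache and memstore fractions together to stay at or below 0.8 of the heap, and the changed values sit exactly at that ceiling. The sketch below only illustrates that arithmetic; the 0.8 limit is stated as an assumption, and `heap_fractions_ok` is a hypothetical helper, not an HBase API.

```python
# Assumption: block cache + memstore fractions must leave at least 20% of
# the region server heap free, i.e. their sum must not exceed 0.8.
MAX_COMBINED_FRACTION = 0.8


def heap_fractions_ok(block_cache, memstore, limit=MAX_COMBINED_FRACTION):
    """Return (within_limit, combined) for the two heap fractions."""
    combined = block_cache + memstore
    # Small tolerance so floating-point rounding at the boundary is not rejected.
    within_limit = combined <= limit + 1e-9
    return within_limit, round(combined, 6)


# Values from the quoted message: original settings, then changed settings.
for label, cache, memstore in [("original", 0.4, 0.4), ("changed", 0.6, 0.2)]:
    ok, combined = heap_fractions_ok(cache, memstore)
    print(label, "combined fraction:", combined, "within limit:", ok)
```

Both the original and the changed pair add up to the same total, so the change shifts heap from writes to reads rather than claiming more of it; raising either value further without lowering the other would cross the assumed limit.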
