hbase-user mailing list archives

From Eric K <codemonke...@gmail.com>
Subject Re: HBase balancer performance
Date Fri, 29 Aug 2014 20:56:18 GMT
Thanks, Esteban and others, for your help on this. I'm still having issues
after digging into this further, so let me provide more info.  It's a bit
of a mystery.

I am using YCSB Workload C to achieve 100% reads; however, I had changed
the request distribution from zipfian to uniform.  I changed it back to
zipfian, using the original Workload C, with the same poor results -- it
takes about 45 minutes after doubling the HBase nodes from 4 to 8 before
YCSB performance returns to normal.  I expect a small drop in performance
due to relocation of data that is no longer cached, but not this bad:
700 ops/sec over the 45-minute period after adding the nodes, compared to
7000 ops/sec once things recover.
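
For reference, the read phase is driven roughly like this (a sketch; the
column family, operation count, and thread count shown here are
illustrative, not the exact values from my scripts):

  # Workload C = 100% reads; requestdistribution toggles zipfian vs. uniform
  bin/ycsb run hbase -P workloads/workloadc \
      -p columnfamily=family \
      -p operationcount=10000000 \
      -p requestdistribution=zipfian \
      -threads 200 -s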

I double-checked, and the same poor performance happens if I add new nodes
but wait 15 minutes before actually starting the workload.  If I wait an
extended 45 minutes before starting the workload, then the period of poor
performance is shortened a lot (~15 minutes).  Next I tried waiting an hour,
and there was no poor performance, so something long-running appears to be
happening within HBase/HDFS, or left over as a result of adding more nodes
to balance over.  But top and iostat don't really show any activity during
that extended 45-minute period (after the 5 minutes of actual balancing).
Is there anything else helpful that I could check?
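
For what it's worth, these are roughly the checks I've been running during
that quiet window (illustrative commands; the regionserver PID and paths
would of course differ per node):

  # Cluster/DataNode view after adding nodes (Hadoop 1.x)
  hadoop dfsadmin -report

  # Per-regionserver load and region counts
  echo "status 'detailed'" | hbase shell

  # GC activity on a regionserver JVM (substitute the HRegionServer PID)
  jstat -gcutil <regionserver-pid> 5000

  # Disk and CPU activity per node
  iostat -x 5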

My HBase heap size is set to 5120 MB in hbase-env.sh.  I'm also running
with HBASE_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=6
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
-XX:MaxGCPauseMillis=100".
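
To rule out GC pauses during that window, I'm thinking of also enabling GC
logging on the regionservers, something along these lines (the log path is
just an example):

  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
      -XX:+PrintGCTimeStamps -Xloggc:/var/log/hbase/gc-regionserver.log"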

I'm not sure how relevant this is, but I flush the user table and run a
major compaction after loading the data, but before starting the actual
read workload and adding the 4 extra nodes.
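
Concretely, that step amounts to something like the following in the hbase
shell (usertable is the default YCSB table name):

  flush 'usertable'
  major_compact 'usertable'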

When I'm not balancing onto more nodes, but instead run a secondary
workload of read requests against a different table on the same HBase
servers, the read requests for the primary user table suffer as expected,
and performance jumps back up to normal after the secondary read requests
finish, also as expected.  So why isn't life good like this when invoking
the balancer?

Thanks,
Eric





On Fri, Aug 22, 2014 at 8:18 PM, Esteban Gutierrez <esteban@cloudera.com>
wrote:

> Hello Eric,
>
> Can you tell us more about the YCSB workload? For instance, for workload B
> you should expect a drop in performance after new RSs are added, since data
> locality will be impacted by the region re-assignment. If you major compact
> the tables or the regions with poor locality you should see an improvement,
> especially if you are running workload B or any workload that is heavy on
> reads. However, you shouldn't see a huge drop in performance if RSs and DNs
> are correctly configured and there are no networking issues.
>
> cheers,
> esteban.
>
>
>
>
>
> --
> Cloudera, Inc.
>
>
>
> On Fri, Aug 22, 2014 at 4:26 PM, Eric K <codemonkeeez@gmail.com> wrote:
>
> > I wish that explained it, but the last reassignment message happens
> > around T+1(min) and then it's just 44 minutes of "skipping because
> > balanced."  Is there anything else that could be preventing HBase from
> > serving data from the newly reassigned nodes?
> >
> > Perhaps HDFS could also be the culprit.  Hadoop namenode logs show a lot
> > of this for T+5 minutes:
> >
> > 2014-08-15 16:45:21,169 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 172.19.152.251:9002 is added to blk_5924440019130413908_2033 size 14
> > 2014-08-15 16:45:21,183 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /system/balancer.id from client DFSClient_-1186888991
> > 2014-08-15 16:45:21,183 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /system/balancer.id is closed by DFSClient_-1186888991
> > 2014-08-15 16:45:21,183 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 198 Total time for transactions(ms): 3Number of transactions batched in Syncs: 0 Number of syncs: 9 SyncTimes(ms): 209
> > 2014-08-15 16:45:21,201 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5924440019130413908 is added to invalidSet of 172.19.152.251:9002
> > 2014-08-15 16:45:23,017 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 172.19.152.251:9002 to delete  blk_5924440019130413908_2033
> > 2014-08-15 16:46:10,991 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 172.19.152.241:9002 is added to blk_3473904193447805515_2028 size 311
> > 2014-08-15 16:46:10,995 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /hbase/.logs/
> >
> > But then there is not much log activity afterwards for the next hour.
> >
> > NameSystem.processReport: from 172.19.149.160:9002, blocks: 109, processing time: 16 msecs
> > 2014-08-15 16:50:58,871 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.149.89:9002, blocks: 67, processing time: 0 msecs
> > 2014-08-15 16:54:27,788 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.152.247:9002, blocks: 84, processing time: 3 msecs
> > 2014-08-15 16:56:08,408 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.152.251:9002, blocks: 58, processing time: 1 msecs
> > 2014-08-15 16:57:57,491 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.152.241:9002, blocks: 83, processing time: 3 msecs
> > 2014-08-15 16:59:37,278 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 172.19.152.221
> > 2014-08-15 16:59:37,278 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 205 Total time for transactions(ms): 3Number of transactions batched in Syncs: 0 Number of syncs: 15 SyncTimes(ms): 307
> > 2014-08-15 16:59:37,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll FSImage from 172.19.152.221
> > 2014-08-15 16:59:37,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 1 SyncTimes(ms): 33
> > 2014-08-15 17:21:45,268 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.149.42:9002, blocks: 81, processing time: 0 msecs
> > 2014-08-15 17:23:43,104 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.149.210:9002, blocks: 90, processing time: 2 msecs
> > 2014-08-15 17:33:26,596 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK* NameSystem.processReport: from 172.19.149.231:9002, blocks: 74, processing time: 1 msecs
> >
> > Thanks,
> > Eric
> >
> >
> > On Fri, Aug 22, 2014 at 5:00 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Was the last 'Successfully deleted unassigned node' log close to
> > > T+45(min)?
> > >
> > > If so, it means that AssignmentManager was busy assigning these regions
> > > during this period.
> > >
> > > Cheers
> > >
> > >
> > > On Fri, Aug 22, 2014 at 1:50 PM, Eric K <codemonkeeez@gmail.com> wrote:
> > >
> > > > After about a minute everything appears to have been moved, since the
> > > > master only prints out lines like:
> > > >
> > > > 2014-08-15 16:40:56,945 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=8 regions=126 average=15.75 mostloaded=16 leastloaded=15
> > > >
> > > > Within the first minute, though, after adding the new nodes, I see a
> > > > lot of activity like this:
> > > >
> > > > 2014-08-15 16:40:49,142 INFO org.apache.hadoop.hbase.master.HMaster:
> > > > balance
> > > >
> > > >
> > >
> >
> hri=usertable,user9673372036854775807,1408128925670.e3c3e82d52f54692a70e7b59aeb56b31.,
> > > > src=172-19-152-247,60020,1408131968632,
> > > > dest=172-19-149-231,60020,1408135240788
> > > > 2014-08-15 16:40:49,142 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Starting
> unassignment
> > > of
> > > > region
> > > >
> > > >
> > >
> >
> usertable,user9673372036854775807,1408128925670.e3c3e82d52f54692a70e7b59aeb56b31.
> > > > (offlining)
> > > > 2014-08-15 16:40:49,142 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > > master:60000-0x147db34a2c80000 Creating unassigned node for
> > > > e3c3e82d52f54692a70e7b59aeb56b31 in a CLOSING state
> > > > 2014-08-15 16:40:49,181 DEBUG
> > > org.apache.hadoop.hbase.master.ServerManager:
> > > > New connection to 172-19-152-247,60020,1408131968632
> > > > 2014-08-15 16:40:49,189 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
> > > > 172-19-152-247,60020,1408131968632 for region
> > > >
> > > >
> > >
> >
> usertable,user9673372036854775807,1408128925670.e3c3e82d52f54692a70e7b59aeb56b31.
> > > > 2014-08-15 16:40:49,189 INFO org.apache.hadoop.hbase.master.HMaster:
> > > > balance
> > > >
> > > >
> > >
> >
> hri=usertable,user1068528100307123131,1408128925659.1fe73c5f470b78e7e9d2a0691659967a.,
> > > > src=172-19-152-241,60020,1408131968630,
> > > > dest=172-19-149-89,60020,1408135238164
> > > > 2014-08-15 16:40:49,189 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Starting
> unassignment
> > > of
> > > > region
> > > >
> > > >
> > >
> >
> usertable,user1068528100307123131,1408128925659.1fe73c5f470b78e7e9d2a0691659967a.
> > > > (offlining)
> > > > 2014-08-15 16:40:49,189 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > > master:60000-0x147db34a2c80000 Creating unassigned node for
> > > > 1fe73c5f470b78e7e9d2a0691659967a in a CLOSING state
> > > > 2014-08-15 16:40:49,223 DEBUG
> > > org.apache.hadoop.hbase.master.ServerManager:
> > > > New connection to 172-19-152-241,60020,1408131968630
> > > > 2014-08-15 16:40:49,223 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Handling
> > > > transition=RS_ZK_REGION_OPENED,
> > > server=172-19-149-231,60020,1408135240788,
> > > > region=7b08e35b1fb9f445d91a90ae89bb401a
> > > > 2014-08-15 16:40:49,224 DEBUG
> > > > org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling
> > > OPENED
> > > > event for
> > > >
> > > >
> > >
> >
> othertable,user1342640501535615655,1408128955190.7b08e35b1fb9f445d91a90ae89bb401a.
> > > > from 172-19-149-231,60020,1408135240788; deleting unassigned node
> > > > 2014-08-15 16:40:49,224 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > > master:60000-0x147db34a2c80000 Deleting existing unassigned node for
> > > > 7b08e35b1fb9f445d91a90ae89bb401a that is in expected state
> > > > RS_ZK_REGION_OPENED
> > > > 2014-08-15 16:40:49,225 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
> > > > 172-19-152-241,60020,1408131968630 for region
> > > >
> > > >
> > >
> >
> usertable,user1068528100307123131,1408128925659.1fe73c5f470b78e7e9d2a0691659967a.
> > > > 2014-08-15 16:40:49,225 INFO org.apache.hadoop.hbase.master.HMaster:
> > > > balance
> > > >
> > > >
> > >
> >
> hri=usertable,user9873372036854775807,1408128925670.3eeb1e68835cbb98c9f796d437fe7ebc.,
> > > > src=172-19-152-241,60020,1408131968630,
> > > > dest=172-19-149-89,60020,1408135238164
> > > > 2014-08-15 16:40:49,225 DEBUG
> > > > org.apache.hadoop.hbase.master.AssignmentManager: Starting
> unassignment
> > > of
> > > > region
> > > >
> > > >
> > >
> >
> usertable,user9873372036854775807,1408128925670.3eeb1e68835cbb98c9f796d437fe7ebc.
> > > > (offlining)
> > > > 2014-08-15 16:40:49,225 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > > master:60000-0x147db34a2c80000 Creating unassigned node for
> > > > 3eeb1e68835cbb98c9f796d437fe7ebc in a CLOSING state
> > > > 2014-08-15 16:40:49,281 DEBUG
> > org.apache.hadoop.hbase.zookeeper.ZKAssign:
> > > > master:60000-0x147db34a2c80000 Successfully deleted unassigned node
> for
> > > > region 7b08e35b1fb9f445d91a90ae89bb401a in expected state
> > > > RS_ZK_REGION_OPENED
> > > >
> > > > Thanks,
> > > > Eric
> > > >
> > > >
> > > >
> > > > On Fri, Aug 22, 2014 at 2:24 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Suppose you add new nodes at time T, can you check master log
> > > > > between T and T+45(min) to see how long region movements took?
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Fri, Aug 22, 2014 at 11:19 AM, Eric K <codemonkeeez@gmail.com> wrote:
> > > > >
> > > > > > I'm experimenting with HBase (0.94.4) and Hadoop (1.0.4) runtime
> > > > > > balancers on just a very tiny 4-node cluster and finding that
> > > > > > performance is bad for an hour after adding new nodes, even though
> > > > > > all the data is supposedly offloaded within a few minutes.
> > > > > >
> > > > > > Using YCSB-generated requests, I load the database with about 18 GB
> > > > > > worth of data across 12 million records.  The keyspace is initially
> > > > > > pre-split to have ~30 regions per node with single replication of
> > > > > > data.  Then I hit HBase with read requests from a set of clients so
> > > > > > that there are 2000 requests outstanding, and new requests are made
> > > > > > immediately after replies are received.  While these requests are
> > > > > > running, after about 2 minutes, I double the nodes from 4 to 8, add
> > > > > > the new node information to the slaves and regionservers files,
> > > > > > start up the new datanode and regionserver processes, and call the
> > > > > > hdfs balancer with the smallest possible threshold of 1.  The
> > > > > > hbase.balancer.period is also set to 10 seconds so as to respond
> > > > > > fast to new nodes.  The dfs.balance.bandwidthPerSec is set to 8 MB/s
> > > > > > per node, but I have also tried higher numbers that don't bottleneck
> > > > > > the offload rate and gotten similar results.
> > > > > >
> > > > > > The expected response is that about half the data and regions are
> > > > > > offloaded to the new nodes in the next few minutes, and the logs
> > > > > > (and dfsadmin reports) confirm that this is indeed happening.
> > > > > > However, I'm seeing the throughput drop from 3000 to 500
> > > > > > requests/sec when the nodes are added, with latencies jumping from
> > > > > > ~1s to 4 or 5 seconds, and this poor performance persists for
> > > > > > almost 45 minutes, when it abruptly improves to 4000 requests/sec
> > > > > > and 500ms latencies.
> > > > > >
> > > > > > I'd appreciate any ideas as to what could be causing that 45-minute
> > > > > > performance delay, and how I can debug this further.  If I add the
> > > > > > new nodes and balance onto them before starting up any read
> > > > > > requests, performance is much better from the start.
> > > > > >
> > > > > > Thanks,
> > > > > > Eric
> > > > > >
> > > > >
> > > >
> > >
> >
>
