hbase-user mailing list archives

From Christian Schäfer <syrious3...@yahoo.de>
Subject Re: 0.92 and Read/writes not scaling
Date Mon, 19 Mar 2012 12:21:47 GMT
Referring to my experience, I expect the client to be the bottleneck, too.

So try increasing the number of client machines (not client threads), each with its own unshared
network interface.

In my case I could double write throughput by doubling the client machine count, with a much smaller
system than yours (5 machines, 4 GB RAM each).
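
Concretely, that means starting the same YCSB workload from each client host in parallel rather
than piling more threads onto one box. A rough sketch only; the class path, workload file and the
'family1' column family are placeholders, and the exact flags depend on your YCSB version:

  # run on every client host, started at roughly the same time
  java -cp ycsb.jar:/etc/hbase/conf:/path/to/hbase/lib/* \
       com.yahoo.ycsb.Client -t \
       -db com.yahoo.ycsb.db.HBaseClient \
       -P workloads/workloada \
       -p columnfamily=family1 \
       -threads 400 -s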

Good Luck
Chris



________________________________
 From: Juhani Connolly <juhanic@gmail.com>
To: user@hbase.apache.org
Sent: Monday, 19 March 2012, 13:02
Subject: Re: 0.92 and Read/writes not scaling
 
I was concerned that may be the case too, which is why we ran the ycsb
tests in addition to our application-specific and general performance
tests. Checking profiles of the execution just showed the vast majority of
time spent waiting for responses. These were all run with 400
threads (though we tried more/fewer just in case).
2012/03/19 20:57 "Mingjian Deng" <koven2049@gmail.com>:

> @Juhani:
> How many clients did you test with? Maybe the bottleneck is the client?
>
> 2012/3/19 Ramkrishna.S.Vasudevan <ramkrishna.vasudevan@huawei.com>
>
> > Hi Juhani
> >
> > Can you tell us more about how the regions are balanced?
> > Are you overloading only one specific region server?
> >
> > Regards
> > Ram
> >
> > > -----Original Message-----
> > > From: Juhani Connolly [mailto:juhanic@gmail.com]
> > > Sent: Monday, March 19, 2012 4:11 PM
> > > To: user@hbase.apache.org
> > > Subject: 0.92 and Read/writes not scaling
> > >
> > > Hi,
> > >
> > > We're running into a brick wall where our throughput numbers will not
> > > scale as we increase server counts, both using custom in-house tests and
> > > ycsb.
> > >
> > > We're using hbase 0.92 on hadoop 0.20.2 (we also experienced the same
> > > issues using 0.90 before switching our testing to this version).
> > >
> > > Our cluster consists of:
> > > - Namenode and hmaster on separate servers, 24 cores, 64GB RAM
> > > - up to 11 datanode/regionservers: 24 cores, 64GB RAM, 4 x 1TB disks (hope
> > > to get this changed)
> > >
> > > We have adjusted our gc settings, and mslabs:
> > >
> > >   <property>
> > >     <name>hbase.hregion.memstore.mslab.enabled</name>
> > >     <value>true</value>
> > >   </property>
> > >
> > >   <property>
> > >     <name>hbase.hregion.memstore.mslab.chunksize</name>
> > >     <value>2097152</value>
> > >   </property>
> > >
> > >   <property>
> > >     <name>hbase.hregion.memstore.mslab.max.allocation</name>
> > >     <value>1024768</value>
> > >   </property>
> > >
> > > hdfs xceivers is set to 8192
> > >
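
(For anyone reproducing this: on Hadoop 0.20 that xceivers limit is the hdfs-site.xml property
below, with the key spelled with its historical typo; 8192 matches the value quoted above.)

  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8192</value>
  </property>
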
> > > We've experimented with a variety of handler counts for namenode,
> > > datanodes and regionservers with no changes in throughput.
> > >
> > > For testing with ycsb, we do the following each time (with nothing else
> > > using the cluster):
> > > - truncate test table
> > > - add a small amount of data, then split the table into 32 regions and
> > > call balancer from the shell.
> > > - load 10m rows
> > > - do a 1:2:7 insert:update:read test with 10 million rows (64k/sec)
> > > - do a 5:5 insert:update test with 10 million rows (23k/sec)
> > > - do a pure read test with 10 million rows (75k/sec)
> > >
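
The split-and-balance step above looks roughly like this in the HBase shell ('usertable' is just
a placeholder for whatever table the workload writes to):

  hbase(main):001:0> truncate 'usertable'
  hbase(main):002:0> split 'usertable'      # repeat until the table holds 32 regions
  hbase(main):003:0> balancer               # ask the master to spread the regions out
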
> > > We have watched ganglia, iostat -d -x, iptraf, top, dstat and a
> > > variety of other diagnostic tools, and network/IO/CPU/memory
> > > bottlenecks seem highly unlikely, as none of them is ever seriously
> > > taxed. This leads me to assume it is some kind of locking issue?
> > > Delaying WAL flushes gives a small throughput bump, but it doesn't
> > > scale.
> > >
> > > There also don't seem to be many figures around to compare ours to.
> > > We can get our throughput numbers higher with tricks like not writing
> > > to the WAL, delaying flushes, or batching requests, but nothing seems
> > > to scale with additional slaves.
> > > Could anyone provide guidance as to what may be preventing throughput
> > > figures from scaling as we increase our slave count?
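
For reference, the client-side tricks mentioned above look roughly like the sketch below with the
0.92 Java client. Table, family, qualifier and value names are placeholders, and "delaying flushes"
would additionally be the DEFERRED_LOG_FLUSH table attribute rather than client code:

  import java.util.ArrayList;
  import java.util.List;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class WriteTricksSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "usertable");   // placeholder table name

      // Batch on the client instead of paying one RPC per put.
      table.setAutoFlush(false);
      table.setWriteBufferSize(2 * 1024 * 1024);      // 2MB client-side write buffer

      List<Put> batch = new ArrayList<Put>();
      for (int i = 0; i < 10000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("family1"), Bytes.toBytes("field0"),
                Bytes.toBytes("value-" + i));
        put.setWriteToWAL(false);                     // skip the WAL entirely (risks data loss)
        batch.add(put);
      }
      table.put(batch);       // puts go out in batched RPCs
      table.flushCommits();   // drain whatever is still in the write buffer
      table.close();
    }
  }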
> >
> >
>