hbase-user mailing list archives

From Chris Tarnas <...@email.com>
Subject Re: Blob storage
Date Tue, 08 Mar 2011 19:25:03 GMT
Yes, HBASE-3483 fixed the majority of our pauses, but not all. As JD points out, we do experience
issues related to inserting into several column families. Luckily, inserts that have the really
imbalanced column family sizes (MB vs. KB) are few and far between, relatively speaking. We
are also "throttled" by going through Thrift, but even then I can push our 10-node cluster
to over 200k requests a second.

On Mar 8, 2011, at 11:16 AM, Ryan Rawson wrote:

> Probably the soft limit flushes, eh?
> On Mar 8, 2011 11:15 AM, "Jean-Daniel Cryans" <jdcryans@apache.org> wrote:
>> On Tue, Mar 8, 2011 at 11:04 AM, Chris Tarnas <cft@email.com> wrote:
>>> Just as a point of reference, in one of our systems we have 500+ million
>>> rows that have a cell in its own column family that is usually about
>>> 100 bytes, but in about 10,000 of the rows the cell can get to 300 MB (average
>>> is probably about 30 MB for the larger data). The jumbo-sized data gets loaded
>>> in separately from the smaller data, although it all goes through the same
>>> pipeline. We are using cdh3b45 (0.90.1), GZ compression, a region size of 1 GB,
>>> and a max value size of 500 MB. So far we have had no problems with the
>>> larger values.
>>> Our largest problem was performance related to inserting into several
>>> column families for the small-sized value loads, and pauses when flushing the
>>> memstores. 0.90.1 helped quite a bit with that.
>> Flushing is done without blocking. Were the pauses you were seeing
>> related to the "too many stores" issue or to the global memstore
>> size?
>> In general, inserting into many families is a bad idea unless the sizes
>> are the same. The worst case is inserting a few KB in one and a few
>> MB in the other. The reason being:
>> https://issues.apache.org/jira/browse/HBASE-3149
>> J-D
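
Editorial illustration of J-D's point about imbalanced families. This is a toy model, not HBase code. It assumes that the flush decision is made on the combined memstore size of all families in a region, and that every family is flushed together, which is the behaviour HBASE-3149 proposes to change. The function name, the 64 MB threshold, and the 2 KB / 2 MB write sizes are all hypothetical values chosen for the sketch.

```python
def simulate_flushes(writes, flush_threshold_bytes):
    """Toy model of region-wide flushing (assumption, not HBase code).

    writes: iterable of (family, size_bytes) pairs, in arrival order.
    Returns a dict mapping family -> list of flushed file sizes in bytes.
    """
    memstore = {}  # bytes currently buffered per family
    flushed = {}   # sizes of the files each family has written so far
    for family, size in writes:
        memstore[family] = memstore.get(family, 0) + size
        # Assumed rule: the decision uses the aggregate size of all families.
        if sum(memstore.values()) >= flush_threshold_bytes:
            # Every family with buffered data is flushed, however little it holds.
            for fam, buffered in memstore.items():
                if buffered > 0:
                    flushed.setdefault(fam, []).append(buffered)
            memstore = {}
    return flushed


KB = 1024
MB = 1024 * KB

# Hypothetical workload: each row puts 2 KB in "small" and 2 MB in "big".
writes = []
for _ in range(320):
    writes.append(("small", 2 * KB))
    writes.append(("big", 2 * MB))

files = simulate_flushes(writes, flush_threshold_bytes=64 * MB)
for fam in sorted(files):
    sizes = files[fam]
    print(fam, len(sizes), "files, first file", sizes[0], "bytes")
```

In this model the "big" family fills the threshold on its own, so the "small" family is forced to write a 64 KB file on every flush. Many tiny store files in one family then mean more compaction work for very little data.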
