hbase-user mailing list archives

From Akshat Mahajan <amaha...@brightedge.com>
Subject Hbase Architecture Questions
Date Fri, 03 Feb 2017 20:27:17 GMT
Hello, all,

We would like advice on our current Hbase configuration and setup.

It differs from the standard advice because we believe our use case is sufficiently unique
to justify it, and our attempts to optimise its performance have not had much success. We
want to 1) check if our understanding is correct, and 2) receive feedback on how to improve
our read/write performance.

Our use-case is as follows:

a) We use Hbase 1.0.0 to store high-volume data (approximately 0.5 to 1 TB) on which we run
map-only Hadoop (CDH 5.5) jobs (with no reduce component) that perform scanning reads. This
batch collection and processing runs weekly over a period of two to three days with no pauses.
We have two clusters, each set up as 1 master with 3 regionserver nodes, that are independent
of each other. They are fairly high-end machines in terms of disk, memory and processors.

b) Every table in our Hbase is associated with a unique collection of items, and all tables
exhibit the same column families (2 column families). After Hadoop runs, we no longer require
the tables, so we delete them. Individual rows are never removed; instead, entire tables are
removed at a time.

c) Typically, these tables are written to very quickly (anywhere between 100 and 200 requests
per second on each regionserver is normal). They are also deleted very frequently. Reads and
writes happen concurrently - it is not unusual for simultaneous high read counts and high
write counts to occur. About 20 tables are created, populated and deleted within the space
of an hour. An individual table may be about 1 to 10 GB in size.

d) Finally, all reads are carried out by multiple Hadoop map jobs through the native Hbase
interface. All writes, however, are carried out through Hbase REST by Python scripts which
collect our data for us. A read for an individual table never coincides with a write to the
same table - reads and writes can both happen, but never on the same table at the same time.
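
As an illustration of the REST write path described in (d), here is a minimal Python sketch
that builds a multi-row JSON body of the kind the Hbase REST gateway accepts, with row keys,
column names and values base64-encoded. The row keys, the column families "cf1" and "cf2",
and the helper names are hypothetical and are not taken from the actual scripts.

```python
import base64
import json


def b64(text):
    """Base64-encode a string, as the HBase REST JSON format expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def build_rows_payload(rows):
    """Build one JSON body carrying several rows.

    `rows` maps a row key to a dict of {"family:qualifier": value}.
    """
    return json.dumps({
        "Row": [
            {
                "key": b64(row_key),
                "Cell": [
                    {"column": b64(column), "$": b64(value)}
                    for column, value in cells.items()
                ],
            }
            for row_key, cells in rows.items()
        ]
    })


# Hypothetical data: "cf1" and "cf2" stand in for the two column families.
payload = build_rows_payload({
    "item-001": {"cf1:url": "http://example.com/a", "cf2:rank": "3"},
    "item-002": {"cf1:url": "http://example.com/b", "cf2:rank": "7"},
})
print(payload)
```

Such a body would be sent with an HTTP PUT and a Content-Type of application/json to a
table URL on the REST gateway. Putting several rows in one body is one way to write the same
data over fewer connections.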

Our current design decisions are as follows:

a) _We have turned major compaction off_.

We are aware this is against recommended advice. Our reasoning for this is that

1) periods of major compaction degrade both our read and write performance heavily (to the
point our schedule is delayed beyond tolerance), and
2) all our tables are temporary - we do not intend to keep them around, and disabling/deleting
old tables closes entire regions altogether and should have the same effect as major compaction
processing tombstone markers on rows. Read performance should then theoretically not be impacted
- we expect that the RegionServer will never even consult that region in doing reads, so storefile
buildup overall should not be an issue.

This last point is based on our prior experience with Cassandra - we have not been able to
find an adequate source confirming it for Hbase, so we may be incorrect. Currently, we are
proceeding with this understanding.

b) _We have made additional efforts to turn off minor compactions as much as possible_.

In particular, our hbase.hstore.compaction.max.size is set to 500 MB, our hbase.hstore.compactionThreshold
is set to 1000 bytes. We do this in order to prevent a minor compaction from being promoted to
a major compaction - since we cannot prevent that promotion directly, we were forced to try to
prevent minor compactions from running at all.

c) We have tried to make REST more performant by increasing the number of REST threads to
about 9000.

This figure is derived from counting the number of connections on REST during periods of high
write load.

d) We have turned on bloom filters, use an LRUBlockCache which caches data only on reads,
and have set tcpnodelay to true. These were in place before we turned major compaction off.
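
For reference, the compaction settings in (a) and (b) can be rendered as an hbase-site.xml
fragment. The sketch below rests on assumptions: the message does not say which property was
used to turn major compaction off, so hbase.hregion.majorcompaction = 0 (which disables the
periodic, time-based major compaction) is a guess, and 500 MB is taken to mean 500 * 1024 * 1024
bytes. Note also that hbase.hstore.compactionThreshold is documented as a number of store files
per store rather than a size in bytes, so the value 1000 quoted above would be read by Hbase
as 1000 files.

```python
import xml.etree.ElementTree as ET

# Values as quoted in the message, except where marked as assumptions.
settings = {
    # Assumption: periodic major compaction disabled via an interval of 0.
    "hbase.hregion.majorcompaction": "0",
    # Assumption: "500 MB" expressed in bytes as 500 * 1024 * 1024.
    "hbase.hstore.compaction.max.size": str(500 * 1024 * 1024),
    # Interpreted by HBase as a store-file count, not a byte size.
    "hbase.hstore.compactionThreshold": "1000",
}


def render_hbase_site(props):
    """Render a dict of property names and values as hbase-site.xml text."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = render_hbase_site(settings)
print(xml_text)
```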

Our observations of performance with these settings:

a) We are seeing an impact on both read and write performance that correlates strongly with store
file buildup. Our store files number between 500 and 1500 on each RS - the total size on each
RegionServer is on the order of 100 to 200 GB at worst.
b) As the number of connections on Hbase REST rises, write performance is impacted. We originally
believed this was due to high frequency of memstore flushes - but increasing the memstore
buffer sizes has had no discernible impact on read/write. Currently, our callqueue.handler.size
is set to 100 - since we experience over 100 requests/second on each RS, we are considering
increasing this to about 300 so we can handle more requests concurrently. Is this a good change,
or are other changes needed as well?
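
One way to frame the handler question in (b) is Little's law: the average number of requests
in flight equals the arrival rate multiplied by the average time each request spends being
served. The sketch below applies it to the request rates quoted above. The latencies are
illustrative assumptions only, since no latency measurements are available in this message.

```python
def handlers_in_use(requests_per_second, mean_latency_seconds):
    """Little's law: average concurrent requests = arrival rate * mean latency."""
    return requests_per_second * mean_latency_seconds


# Request rates per regionserver from the message; latencies are assumed values.
for rate in (100, 200):
    for latency in (0.05, 0.5, 1.0):
        busy = handlers_in_use(rate, latency)
        print(f"{rate} req/s at {latency:.2f} s per request -> about {busy:.0f} busy handlers")
```

Under this rough model, 100 handlers would only be fully occupied at these rates if requests
took on the order of 0.5 to 1 second each, so measuring request latency seems worthwhile before
changing the handler setting.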

Unfortunately, we cannot provide raw metrics on the magnitude of read/write performance degradation
as we do not have sufficient tracking for them. A rough proxy - we do know our clusters are
capable of processing 200 jobs in an hour. This now goes down to as low as 30-50 jobs per
hour with minimal changes to the jobs themselves. We wish to be able to get back to our original
throughput.
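
To put the rough proxy above in relative terms, a short calculation using only the job counts
quoted in this message (the variable names are illustrative):

```python
baseline_jobs_per_hour = 200
degraded_jobs_per_hour = (30, 50)  # observed low and high ends

# Fraction of the original throughput that remains at each end of the range.
remaining = [jobs / baseline_jobs_per_hour for jobs in degraded_jobs_per_hour]

# How many times slower the cluster is compared with the baseline.
slowdown = [baseline_jobs_per_hour / jobs for jobs in degraded_jobs_per_hour]

for jobs, frac, factor in zip(degraded_jobs_per_hour, remaining, slowdown):
    print(f"{jobs} jobs/hour: {frac:.0%} of baseline, {factor:.1f}x slower")
```

In other words, the clusters are running at roughly a sixth to a quarter of their earlier rate.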

For now, in periods of high stress (large jobs or quick reads/writes), we are manually clearing
out the hbase folder in HDFS (including store files, WALs, oldWALs and archived files), and
resetting our clusters to an empty state. We are aware this is not ideal, and are looking
for ways to not have to do this. Our understanding of how Hbase works is probably imperfect,
and we would appreciate any advice or feedback in this regard.

Akshat Mahajan
