hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ted Yu <yuzhih...@gmail.com>
Subject Re: Hbase Architecture Questions
Date Sat, 04 Feb 2017 00:16:53 GMT
bq. mapping job that does scanning reads

I assume proper time range (possibly with Filter's) is specified on the
Scan object so that only data necessary for the job is fetched.

Constantly creating / disabling / dropping tables is not amenable to the
cluster.

See if you can reuse table whose data has been scanned. To drop the data
before next round of writes come in, you can set TTL for the table properly.

Cheers

On Fri, Feb 3, 2017 at 1:46 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. We use Hbase 1.0.0
>
> 1.0.0 was quite old.
>
> Can you try more recent releases such as 1.3.0 (the hbase-thrift module
> should be more robust) ?
>
> If your nodes have enough memory, have you thought of using bucket cache
> to improve read performance ?
>
> Cheers
>
> On Fri, Feb 3, 2017 at 1:34 PM, Akshat Mahajan <amahajan@brightedge.com>
> wrote:
>
>> > By "After Hadoop runs", you mean a batch collection/processing job?
>> MapReduce? Hadoop is a collection of distributed processing tools:
>> Filesystem (HDFS) and distributed execution (YARN)...
>>
>> Yes, we mean a batch collection/processing job. We use Yarn and HDFS, and
>> we employ the Hadoop API to run only mappers (we do not need to perform
>> reductions on our data) in Java. But the collection happens through Python,
>> necessitating the use of Thrift, and the actual processing happens through
>> Yarn.
>>
>> > If you are deleting the table in a short period of time, then, yes,
>> disabling major compactions is probably "OK". However, not running major
>> compactions will have read performance implications. While there are
>> tombstones, these represent extra work that your regionservers must perform.
>>
>> Can you please clarify? Our understanding is that a table deletion will
>> close the entire region corresponding to that table across all
>> RegionServers. If that is correct, why should there be a read performance
>> issue? (I'm assuming that closed regions are never accessed again by the
>> regionservers - am I correct?).
>>
>> > It sounds like you could just adopt a bulk-loading approach instead of
>> doing live writes into HBase..
>>
>> This is certainly a possibility, but would require a fairly sizable
>> rewrite for our application.
>>
>> >  I doubt REST is ever going to be a "well-performing" tool. You're
>> likely chasing your tail to try to optimize this. Your time would be better
>> spent using the HBase Java API directly.
>>
>> We are constrained by having to use Python, so we can't use the native
>> API. We switched from Thrift to REST when we found Thrift kept dying under
>> the load we put it under.
>>
>> > Are you taxing your machines' physical resources? What does I/O, CPU
>> and memory utilization look like?
>>
>> Let me get back to you on this front more fully in a followup.
>>
>> Currently providing estimates for our bulkier and more problematic
>> cluster:
>>
>> We are not constrained by memory - our free memory utilisation on all the
>> regionservers is close to 99%, but about 40% of that in each RS is used in
>> caches that will be readily given to any programs that require it by the
>> kernel (as assessed by `free`). On the master node, it is closer to 60%
>> memory utilisation.
>>
>> Our CPU utilisation varies, but under regular operation, user time is at
>> 60 - 40 percent on all regionservers. CPU idle time is very high, usually,
>> about 90%, and CPU system time is about 5%.
>>
>> I/O wait time is very, very low - about 0.23% on average.
>>
>> > Yeah I'm not sure why you are doing this. There's no reason I can think
>> of as to why you would need to do this...
>>
>> Believe it or not, it is the only thing we have found that helps us
>> restore performance temporarily.
>>
>> It's not ideal, and we don't want to keep doing it, though.
>>
>> Akshat
>>
>>
>> -----Original Message-----
>> From: Josh Elser [mailto:elserj@apache.org]
>> Sent: Friday, February 03, 2017 1:05 PM
>> To: user@hbase.apache.org
>> Subject: Re: Hbase Architecture Questions
>>
>>
>>
>> Akshat Mahajan wrote:
>>
>> > b) Every table in our Hbase is associated with a unique collection of
>> items, and all tables exhibit the same column families (2 column families).
>> After Hadoop runs, we no longer require the tables, so we delete them.
>> Individual rows are never removed; instead, entire tables are removed at a
>> time.
>>
>> By "After Hadoop runs", you mean a batch collection/processing job?
>> MapReduce? Hadoop is a collection of distributed processing tools:
>> Filesystem (HDFS) and distributed execution (YARN)...
>>
>> > a) _We have turned major compaction off_.
>> >
>> > We are aware this is against recommended advice. Our reasoning for
>> > this is that
>> >
>> > 1) periods of major compaction degrade both our read and write
>> > performance heavily (to the point our schedule is delayed beyond
>> > tolerance), and
>> > 2) all our tables are temporary - we do not intend to keep them around,
>> and disabling/deleting old tables closes entire regions altogether and
>> should have the same effect as major compaction processing tombstone
>> markers on rows. Read performance should then theoretically not be impacted
>> - we expect that the RegionServer will never even consult that region in
>> doing reads, so storefile buildup overall should not be an issue.
>>
>> If you are deleting the table in a short period of time, then, yes,
>> disabling major compactions is probably "OK".
>>
>> However, not running major compactions will have read performance
>> implications. While there are tombstones, these represent extra work that
>> your regionservers must perform.
>>
>> >
>> > b) _We have made additional efforts to turn off minor compactions as
>> much as possible_.
>> >
>> > In particular, our hbase.hstore.compaction.max.size is set to 500 MB,
>> our hbase.hstore.compactionThreshold is set to 1000 bytes. We do this in
>> order to prevent a minor compaction from becoming a major compaction -
>> since we cannot prevent that, we were forced to try and prevent minor
>> compactions running at all.
>>
>> It sounds like you could just adopt a bulk-loading approach instead of
>> doing live writes into HBase..
>>
>> > c)  We have tried to make REST more performant by improving the number
>> of REST threads to about 9000.
>> >
>> > This figure is derived from counting the number of connections on REST
>> during periods of high write load.
>>
>> I doubt REST is ever going to be a "well-performing" tool. You're likely
>> chasing your tail to try to optimize this. Your time would be better
>> spent using the HBase Java API directly.
>>
>> > d) We have turned on bloom filters, use an LRUBlockCache which caches
>> data only on reads, and have set tcpnodelay to true. These were in place
>> before we turned major compaction off.
>> >
>> > Our observations with these settings in performance:
>> >
>> > a) We are seeing an impact on both read/write performance correlated
>> strongly with store file buildup. Our store files number between 500 to
>> 1500 on each RS - the total size on each RegionServer are on the order 100
>> to 200 GBs at worst.
>> > b) As number of connections on Hbase REST rises, write performance is
>> impacted. We originally believed this was due to high frequency of memstore
>> flushes - but increasing the memstore buffer sizes has had no discernible
>> impact on read/write. Currently, our callqueue.handler.size is set to 100 -
>> since we experience over 100 requests/second on each RS, we are considering
>> increasing this to about 300 so we can handle more requests concurrently.
>> Is this a good change, or are other changes needed as well?
>>
>> Are you taxing your machines' physical resources? What does I/O, CPU and
>> memory utilization look like?
>>
>> > Unfortunately, we cannot provide raw metrics on the magnitude of
>> read/write performance degradation as we do not have sufficient tracking
>> for them. A rough proxy - we do know our clusters are capable of processing
>> 200 jobs in an hour. This now goes down to as low as 30-50 jobs per hour
>> with minimal changes to the jobs themselves. We wish to be able to get back
>> to our original performance.
>> >
>> > For now, in periods of high stress (large jobs or quick reads/writes),
>> we are manually clearing out the hbase folder in HDFS (including store
>> files, WALs, oldWALs and archived files), and resetting our clusters to an
>> empty state. We are aware this is not ideal, and are looking for ways to
>> not have to do this. Our understanding of how Hbase works is probably
>> imperfect, and we would appreciate any advice or feedback in this regard.
>>
>> Yeah I'm not sure why you are doing this. There's no reason I can think
>> of as to why you would need to do this...
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message