hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: HBase Random Read latency > 100ms
Date Wed, 09 Oct 2013 05:29:57 GMT
Good call.
You could try enabling hbase.regionserver.checksum.verify, which causes HBase to do its own
checksums rather than relying on HDFS (and which saves 1 IO per block get).

I do think you can expect the index blocks to be cached at all times.

-- Lars
________________________________
From: Vladimir Rodionov <vrodionov@carrieriq.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Tuesday, October 8, 2013 8:44 PM
Subject: RE: HBase Random Read latency > 100ms


Upd.

Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file IOs (data + .crc) in the
worst case. I think if a Bloom filter is enabled then
it is going to be 6 file IOs in the worst case (large data set), therefore you will have not
5 IO requests in the queue but up to 20-30 IO requests in the queue.
This definitely explains > 100ms avg latency.
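
The counting above can be sketched in a few lines of Python. This is only a restatement of the numbers in this message, not a measurement: the function names are made up for illustration, the extra read for the Bloom filter is taken from the "6 File IO" figure, and the `hdfs_checksum_files=False` case is an assumption that models the hbase.regionserver.checksum.verify suggestion as removing the separate .crc read.

```python
def file_ios_per_get(bloom_enabled: bool, hdfs_checksum_files: bool = True) -> int:
    """Worst-case file IOs for one Get, using the counts quoted in the thread."""
    hdfs_reads = 2  # index block + data block
    if bloom_enabled:
        hdfs_reads += 1  # one more block read for the Bloom filter (assumed)
    # Each HDFS read touches the data file and, with HDFS checksums, the .crc file.
    ios_per_read = 2 if hdfs_checksum_files else 1
    return hdfs_reads * ios_per_read


def queued_ios(gets_in_flight: int, bloom_enabled: bool,
               hdfs_checksum_files: bool = True) -> int:
    """File IOs queued on one server when every in-flight Get misses all caches."""
    return gets_in_flight * file_ios_per_get(bloom_enabled, hdfs_checksum_files)


if __name__ == "__main__":
    for bloom in (False, True):
        print("bloom =", bloom,
              "| IOs per Get:", file_ios_per_get(bloom),
              "| queued for 5 Gets:", queued_ios(5, bloom))
```

With 5 Gets in flight per server this gives the 20 to 30 queued IOs mentioned above; with the .crc read removed the same model gives 10 to 15.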



Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________

From: Vladimir Rodionov
Sent: Tuesday, October 08, 2013 7:24 PM
To: user@hbase.apache.org
Subject: RE: HBase Random Read latency > 100ms

Ramu,

You have 8 server boxes and 10 clients. You have 40 requests in parallel - 5 per RS/DN?

You have 5 random-read requests in the IO queue of your single RAID1. With an avg read latency
of 10 ms, 5 requests in the queue will give us 30ms. Add some overhead
from HDFS + HBase and you will probably have your issue explained?
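
One way to read the 30ms figure is as the mean completion time of 5 requests served one after another. The Python sketch below works under that assumption: a single device, strict FIFO service and a fixed 10 ms per IO are simplifications, and the function names are hypothetical. A real RAID1 can serve reads from both mirrors, so treat the output as a rough bound rather than a prediction.

```python
def completion_times_ms(queue_depth: int, service_ms: float) -> list:
    """Completion time of each request when one device serves them in order."""
    return [service_ms * position for position in range(1, queue_depth + 1)]


def mean_latency_ms(queue_depth: int, service_ms: float) -> float:
    """Average completion time over all requests in the queue."""
    times = completion_times_ms(queue_depth, service_ms)
    return sum(times) / len(times)


if __name__ == "__main__":
    clients, threads_per_client, region_servers = 10, 4, 8
    in_flight = clients * threads_per_client        # 40 parallel requests
    per_server = in_flight / region_servers         # 5 per RS/DN
    print("requests per server:", per_server)
    for depth in (5, 20, 30):
        print("queue depth", depth, "-> mean latency",
              mean_latency_ms(depth, 10.0), "ms")
```

Under these assumptions a queue of 5 averages 30 ms, while the 20 to 30 queued IOs from the follow-up message above average 105 to 155 ms, which is in line with the observed > 100ms.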

Your bottleneck is your disk system, I think. When you serve most requests from disk, as
in your large data set scenario, make sure you have an adequate disk sub-system and
that it is configured properly. Block Cache and OS page cache cannot help you in this case, as the
working data set is larger than both caches.

The good performance numbers in the small data set scenario are explained by the fact that the data
fits into the OS page cache and Block Cache - you do not read data from disk even if
you disable the block cache.


Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Ramu M S [ramu.malur@gmail.com]
Sent: Tuesday, October 08, 2013 6:00 PM
To: user@hbase.apache.org
Subject: Re: HBase Random Read latency > 100ms

Hi All,

After a few suggestions from the earlier mails, I changed the following:

1. Heap Size to 16 GB
2. Block Size to 16KB
3. HFile size to 8 GB (Table now has 256 regions, 32 per server)
4. Data Locality Index is 100 in all RS

I have clients running on 10 machines, each with 4 threads, so 40 in total.
This is the same in all tests.

Result:
           1. Average latency is still >100ms.
           2. Heap occupancy is around 2-2.5 GB in all RS

A few more tests were carried out yesterday:

TEST 1: Small data set (100 Million records, each with 724 bytes).
===========================================
Configurations:
1. Heap Size to 1 GB
2. Block Size to 16KB
3. HFile size to 1 GB (Table now has 128 regions, 16 per server)
4. Data Locality Index is 100 in all RS

I disabled Block Cache on the table, to make sure I read everything from
disk, most of the time.

Result:
   1. Average Latency is 8ms and throughput went up to 6K/Sec per RS.
   2. With Block Cache enabled again, I got average latency around 2ms
and throughput of 10K/Sec per RS.
       Heap occupancy around 650 MB
   3. Increased the Heap to 16GB, with Block Cache still enabled, I got
average latency around 1 ms and throughput 20K/Sec per RS
       Heap Occupancy around 2-2.5 GB in all RS

TEST 2: Large Data set (1.8 Billion records, each with 724 bytes)
==================================================
Configurations:
1. Heap Size to 1 GB
2. Block Size to 16KB
3. HFile size to 1 GB (Table now has 2048 regions, 256 per server)
4. Data Locality Index is 100 in all RS

Result:
  1. Average Latency is > 500ms to start with and gradually decreases, but
even after around 100 Million reads it is still >100 ms
  2. Block Cache = TRUE/FALSE does not make any difference here. Even Heap
Size (1GB / 16GB) does not make any difference.
  3. Heap occupancy is around 2-2.5 GB under 16GB Heap and around 650 MB
under 1GB Heap.

GC Time in all of the scenarios is around 2ms/Second, as shown in the
Cloudera Manager.

Reading most of the items from disk in the small data set scenario gives better
results and very low latencies.

The number of regions per RS and the HFile size do make a huge difference in my
cluster.
Keeping 100 regions per RS as the max (most of the discussions suggest this),
should I restrict the HFile size to 1GB, and thus reduce the storage
capacity (from 700 GB to 100GB per RS)?
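
For a sense of scale, the raw sizes of the two test data sets can be worked out from the record counts above. The Python sketch below is only back-of-the-envelope arithmetic: it ignores key/value overhead, compression and HDFS replication, assumes the data is spread evenly over the 8 servers, and uses decimal gigabytes. The function names are hypothetical.

```python
RECORD_BYTES = 724
REGION_SERVERS = 8
GB = 10 ** 9  # decimal gigabytes, for round numbers


def raw_bytes(records: int) -> int:
    """Raw payload size of a data set, without any storage overhead."""
    return records * RECORD_BYTES


def per_server_gb(records: int) -> float:
    """Raw payload per region server, assuming an even spread."""
    return raw_bytes(records) / REGION_SERVERS / GB


def capacity_gb(regions_per_server: int, max_hfile_gb: int) -> int:
    """Upper bound on data per server if each region holds one max-size HFile."""
    return regions_per_server * max_hfile_gb


if __name__ == "__main__":
    print("TEST 1 per server (GB):", per_server_gb(100_000_000))
    print("TEST 2 per server (GB):", per_server_gb(1_800_000_000))
    print("100 regions x 1 GB (GB):", capacity_gb(100, 1))
```

Under these assumptions the small data set is roughly 9 GB per server and the large one roughly 163 GB per server, and 100 regions of 1 GB each cap a server at the 100GB mentioned above.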

Please advise.

Thanks,
Ramu


On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
<vrodionov@carrieriq.com> wrote:

> What are your current heap and block cache sizes?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ramu M S [ramu.malur@gmail.com]
> Sent: Monday, October 07, 2013 10:55 PM
> To: user@hbase.apache.org
> Subject: Re: HBase Random Read latency > 100ms
>
> Hi All,
>
> Average Latency is still around 80ms.
> I have done the following,
>
> 1. Enabled Snappy Compression
> 2. Reduce the HFile size to 8 GB
>
> Should I attribute these results to bad Disk Configuration OR anything else
> to investigate?
>
> - Ramu
>
>
> On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <ramu.malur@gmail.com> wrote:
>
> > Vladimir,
> >
> > Thanks for the Insights into Future Caching features. Looks very
> > interesting.
> >
> > - Ramu
> >
> >
> > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
> > vrodionov@carrieriq.com> wrote:
> >
> >> Ramu,
> >>
> >> If your working set of data fits into 192GB you may get an additional
> >> boost by utilizing the OS page cache, or wait until the 0.98 release,
> >> which introduces a new bucket cache implementation (a port of the
> >> Facebook L2 cache). You can try the vanilla bucket cache in 0.96 (not
> >> released yet but due soon). Both caches store data off-heap, but the
> >> Facebook version can store encoded and compressed data and the vanilla
> >> bucket cache cannot.
> >> There are some options for using the available RAM efficiently (at
> >> least in upcoming HBase releases). If your data set does not fit in RAM
> >> then your only hope is your 24 SAS drives, depending on your RAID
> >> settings, disk IO perf and HDFS configuration (I think the latest
> >> Hadoop is preferable here).
> >>
> >> The OS page cache is the most vulnerable and volatile: it cannot be
> >> controlled and can easily be polluted by other processes or by HBase
> >> itself (a long scan).
> >> With the Block cache you have more control, but the first truly usable
> >> *official* implementation is going to be part of the 0.98 release.
> >>
> >> As far as I understand, your use case would definitely be covered by
> >> something similar to the BigTable ScanCache (RowCache), but there is no
> >> such cache in HBase yet.
> >> One major advantage of RowCache vs BlockCache (apart from being much
> >> more efficient in RAM usage) is resilience to Region compactions. Each
> >> minor Region compaction partially invalidates the Region's data in
> >> BlockCache, and a major compaction invalidates the Region's data
> >> completely. This would not be the case with RowCache (were it to be
> >> implemented).
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: vrodionov@carrieriq.com
> >>
> >> ________________________________________
> >> From: Ramu M S [ramu.malur@gmail.com]
> >> Sent: Monday, October 07, 2013 5:25 PM
> >> To: user@hbase.apache.org
> >> Subject: Re: HBase Random Read latency > 100ms
> >>
> >> Vladimir,
> >>
> >> Yes. I am fully aware of the HDD limitation and the wrong configuration
> >> wrt RAID.
> >> Unfortunately, the hardware is leased from others for this work and I
> >> wasn't consulted on the h/w specification for the tests that I am
> >> doing now. Even the RAID cannot be turned off or set to RAID-0.
> >>
> >> The production system is specified according to Hadoop needs (100 nodes
> >> with 16-core CPU, 192 GB RAM, 24 x 600GB SAS drives; RAID cannot be
> >> completely turned off, so we are creating 1 Virtual Disk containing
> >> only 1 Physical Disk, with the VD RAID level set to RAID-0). These
> >> systems are still not available. If you have any suggestions on the
> >> production setup, I will be glad to hear them.
> >>
> >> Also, as pointed out earlier, we are planning to use HBase also as an
> >> in-memory KV store to access the latest data.
> >> That's why the RAM is so large in this configuration. But it looks like
> >> we would run into more problems than gains from this.
> >>
> >> Keeping that aside, I was trying to get the maximum out of the current
> >> cluster. Or, as you said, is 500-1000 OPS the max I could get out of
> >> this setup?
> >>
> >> Regards,
> >> Ramu
> >>
> >>
> >>
> >> Confidentiality Notice:  The information contained in this message,
> >> including any attachments hereto, may be confidential and is intended
> to be
> >> read only by the individual or entity to whom this message is
> addressed. If
> >> the reader of this message is not the intended recipient or an agent or
> >> designee of the intended recipient, please note that any review, use,
> >> disclosure or distribution of this message or its attachments, in any
> form,
> >> is strictly prohibited.  If you have received this message in error,
> please
> >> immediately notify the sender and/or Notifications@carrieriq.com and
> >> delete or destroy any copy of this message and its attachments.
> >>
> >
> >
>
>

