hbase-user mailing list archives

From Vladimir Rodionov <vrodio...@carrieriq.com>
Subject RE: HBase Random Read latency > 100ms
Date Wed, 09 Oct 2013 17:59:56 GMT
I can't speak for SCR. There is a possibility that the feature is broken,
of course. But the fact that hbase.regionserver.checksum.verify does not
affect performance means that the OS effectively caches the HDFS
checksum files.
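
(For scale: with HDFS's default 4-byte CRC per 512 bytes of data, the
checksum metadata is well under 1% of the data size - an assumption
about default settings, not a measurement from this thread - so it fits
in the page cache easily.)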


Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Ramu M S [ramu.malur@gmail.com]
Sent: Wednesday, October 09, 2013 12:11 AM
To: user@hbase.apache.org; lars hofhansl
Subject: Re: HBase Random Read latency > 100ms

Hi All,

Sorry, there was a mistake in the earlier tests (the clients were not
reduced; I forgot to change the parameter before running the tests).

With 8 clients:

SCR Enabled : Average Latency is 25 ms, IO Wait % is around 8
SCR Disabled: Average Latency is 10 ms, IO Wait % is around 2

Still, SCR disabled gives better results, which confuses me. Can anyone
clarify?

Also, I tried setting the parameter Lars suggested
(hbase.regionserver.checksum.verify = true) with SCR disabled.
Average latency is around 9.8 ms, a fraction lower.

Thanks
Ramu


On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S <ramu.malur@gmail.com> wrote:

> Hi All,
>
> I ran only 8 parallel clients:
>
> With SCR Enabled : Average Latency is 80 ms, IO Wait % is around 8
> With SCR Disabled: Average Latency is 40 ms, IO Wait % is around 2
>
> I always thought that enabling SCR allows a client co-located with the
> DataNode to read HDFS file blocks directly, giving a performance boost
> to distributed clients that are aware of locality.
>
> Is my understanding wrong, or does it just not apply to my scenario?
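>
> (For anyone reproducing this, a minimal sketch of the short-circuit
> read settings, assuming Hadoop 2.x-style SCR; the socket path below is
> the commonly documented example, not taken from this thread:)
>
>   <!-- hdfs-site.xml, on DataNodes and on HBase region server hosts -->
>   <property>
>     <name>dfs.client.read.shortcircuit</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>dfs.domain.socket.path</name>
>     <value>/var/run/hadoop-hdfs/dn._PORT</value>
>   </property>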
>
> Meanwhile, I will try setting the parameter suggested by Lars and post
> the results.
>
> Thanks,
> Ramu
>
>
> On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl <larsh@apache.org> wrote:
>
>> Good call.
>> You could try enabling hbase.regionserver.checksum.verify, which will
>> cause HBase to do its own checksums rather than relying on HDFS (and
>> which saves 1 IO per block get).
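>>
>> (A minimal sketch of that setting as it would appear in
>> hbase-site.xml:)
>>
>>   <property>
>>     <name>hbase.regionserver.checksum.verify</name>
>>     <value>true</value>
>>   </property>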
>>
>> I do think you can expect the index blocks to be cached at all times.
>>
>> -- Lars
>> ________________________________
>> From: Vladimir Rodionov <vrodionov@carrieriq.com>
>> To: "user@hbase.apache.org" <user@hbase.apache.org>
>> Sent: Tuesday, October 8, 2013 8:44 PM
>> Subject: RE: HBase Random Read latency > 100ms
>>
>>
>> Upd.
>>
>> Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file
>> IOs (data + .crc) in the worst case. I think that if the Bloom Filter
>> is enabled, then it is going to be 6 file IOs in the worst case (large
>> data set); therefore you will have not 5 IO requests in the queue but
>> up to 20-30 IO requests in the queue.
>> This definitely explains the > 100 ms avg latency.
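>>
>> (A rough worked example under those assumptions: 40 client threads
>> across 8 region servers is 5 Gets in flight per server; at 4-6 file
>> IOs per Get, that is 20-30 outstanding requests in each disk queue,
>> so a new read can sit behind many ~10 ms seeks - easily adding up to
>> more than 100 ms.)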
>>
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodionov@carrieriq.com
>>
>> ________________________________________
>>
>> From: Vladimir Rodionov
>> Sent: Tuesday, October 08, 2013 7:24 PM
>> To: user@hbase.apache.org
>> Subject: RE: HBase Random Read latency > 100ms
>>
>> Ramu,
>>
>> You have 8 server boxes and 10 clients. You have 40 requests in
>> parallel - 5 per RS/DN?
>>
>> You have 5 random-read requests in the IO queue of your single RAID1.
>> With an avg read latency of 10 ms, 5 requests in the queue will give
>> us ~30 ms. Add some overhead from HDFS + HBase and your issue is
>> probably explained?
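>>
>> (A rough sketch of that estimate: a read arriving at a queue that
>> already holds a few ~10 ms requests waits for each one ahead of it,
>> so 2-3 queued reads plus its own service time gives roughly 30-40 ms
>> before any HDFS or HBase overhead is added.)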
>>
>> Your bottleneck is your disk system, I think. When you serve most of
>> the requests from disk, as in your large data set scenario, make sure
>> you have an adequate disk sub-system and that it is configured
>> properly. The Block Cache and OS page cache cannot help you in this
>> case, as the working data set is larger than both caches.
>>
>> The good performance numbers in the small data set scenario are
>> explained by the fact that the data fits into the OS page cache and
>> the Block Cache - you do not read data from disk even if you disable
>> the Block Cache.
>>
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodionov@carrieriq.com
>>
>> ________________________________________
>> From: Ramu M S [ramu.malur@gmail.com]
>> Sent: Tuesday, October 08, 2013 6:00 PM
>> To: user@hbase.apache.org
>> Subject: Re: HBase Random Read latency > 100ms
>>
>> Hi All,
>>
>> After a few suggestions from the earlier mails, I changed the
>> following (the settings are sketched below):
>>
>> 1. Heap Size to 16 GB
>> 2. Block Size to 16KB
>> 3. HFile size to 8 GB (Table now has 256 regions, 32 per server)
>> 4. Data Locality Index is 100 in all RS
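>>
>> (A minimal sketch of where those settings live; 'usertable' and 'cf'
>> are hypothetical table/family names, not from this thread:)
>>
>>   # hbase-env.sh: region server heap, in MB
>>   export HBASE_HEAPSIZE=16384
>>
>>   # hbase shell: 16 KB block size on the column family
>>   alter 'usertable', {NAME => 'cf', BLOCKSIZE => '16384'}
>>
>>   <!-- hbase-site.xml: split regions at 8 GB -->
>>   <property>
>>     <name>hbase.hregion.max.filesize</name>
>>     <value>8589934592</value>
>>   </property>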
>>
>> I have clients running on 10 machines, each with 4 threads, so 40 in
>> total. This is the same in all tests.
>>
>> Result:
>>            1. Average latency is still >100ms.
>>            2. Heap occupancy is around 2-2.5 GB in all RS
>>
>> Few more tests carried out yesterday,
>>
>> TEST 1: Small data set (100 Million records, each with 724 bytes).
>> ===========================================
>> Configurations:
>> 1. Heap Size to 1 GB
>> 2. Block Size to 16KB
>> 3. HFile size to 1 GB (Table now has 128 regions, 16 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> I disabled the Block Cache on the table to make sure I read everything
>> from disk most of the time.
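>>
>> (Disabled via the shell; 'usertable' and 'cf' are placeholders for my
>> actual table and column family:)
>>
>>   disable 'usertable'
>>   alter 'usertable', {NAME => 'cf', BLOCKCACHE => 'false'}
>>   enable 'usertable'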
>>
>> Result:
>>    1. Average Latency is 8ms and throughput went up to 6K/Sec per RS.
>>    2. With Block Cache enabled again, I got average latency around 2ms
>> and throughput of 10K/Sec per RS.
>>        Heap occupancy around 650 MB
>>    3. Increased the Heap to 16GB, with Block Cache still enabled, I got
>> average latency around 1 ms and throughput 20K/Sec per RS
>>        Heap Occupancy around 2-2.5 GB in all RS
>>
>> TEST 2: Large Data set (1.8 Billion records, each with 724 bytes)
>> ==================================================
>> Configurations:
>> 1. Heap Size to 1 GB
>> 2. Block Size to 16KB
>> 3. HFile size to 1 GB (Table now has 2048 regions, 256 per server)
>> 4. Data Locality Index is 100 in all RS
>>
>> Result:
>>   1. Average Latency is > 500ms to start with and gradually decreases, but
>> even after around 100 Million reads it is still >100 ms
>>   2. Block Cache = TRUE/FALSE does not make any difference here. Even Heap
>> Size (1GB / 16GB) does not make any difference.
>>   3. Heap occupancy is around 2-2.5 GB under 16GB Heap and around 650 MB
>> under 1GB Heap.
>>
>> GC time in all of the scenarios is around 2 ms/second, as shown in
>> Cloudera Manager.
>>
>> Reading most of the items from disk in the small data set scenario
>> gives better results and very low latencies.
>>
>> The number of regions per RS and the HFile size do make a huge
>> difference in my cluster.
>> Keeping 100 regions per RS as the maximum (most of the discussions
>> suggest this), should I restrict the HFile size to 1 GB, thus reducing
>> the storage capacity (from 700 GB to 100 GB per RS)?
>>
>> Please advise.
>>
>> Thanks,
>> Ramu
>>
>>
>> On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
>> <vrodionov@carrieriq.com> wrote:
>>
>> > What are your current heap and block cache sizes?
>> >
>> > Best regards,
>> > Vladimir Rodionov
>> > Principal Platform Engineer
>> > Carrier IQ, www.carrieriq.com
>> > e-mail: vrodionov@carrieriq.com
>> >
>> > ________________________________________
>> > From: Ramu M S [ramu.malur@gmail.com]
>> > Sent: Monday, October 07, 2013 10:55 PM
>> > To: user@hbase.apache.org
>> > Subject: Re: HBase Random Read latency > 100ms
>> >
>> > Hi All,
>> >
>> > Average Latency is still around 80ms.
>> > I have done the following:
>> >
>> > 1. Enabled Snappy Compression
>> > 2. Reduced the HFile size to 8 GB
>> >
>> > Should I attribute these results to bad disk configuration, or is
>> > there anything else to investigate?
>> >
>> > - Ramu
>> >
>> >
>> > On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <ramu.malur@gmail.com> wrote:
>> >
>> > > Vladimir,
>> > >
>> > > Thanks for the insights into the future caching features. They
>> > > look very interesting.
>> > >
>> > > - Ramu
>> > >
>> > >
>> > > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
>> > > vrodionov@carrieriq.com> wrote:
>> > >
>> > >> Ramu,
>> > >>
>> > >> If your working set of data fits into 192 GB, you may get an
>> > >> additional boost by utilizing the OS page cache, or wait until
>> > >> the 0.98 release, which introduces a new bucket cache
>> > >> implementation (a port of Facebook's L2 cache). You can try the
>> > >> vanilla bucket cache in 0.96 (not released yet, but due soon).
>> > >> Both caches store data off-heap, but the Facebook version can
>> > >> store encoded and compressed data, while the vanilla bucket cache
>> > >> does not. There are some options for utilizing the available RAM
>> > >> efficiently (at least in upcoming HBase releases). If your data
>> > >> set does not fit in RAM, then your only hope is your 24 SAS
>> > >> drives; everything then depends on your RAID settings, disk IO
>> > >> performance, and HDFS configuration (I think the latest Hadoop is
>> > >> preferable here).
>> > >>
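>> > >> (For reference, a minimal sketch of the bucket cache knobs as
>> > >> they appear in the 0.96-era configuration; the size is a made-up
>> > >> example, not a recommendation from this thread:)
>> > >>
>> > >>   <!-- hbase-site.xml -->
>> > >>   <property>
>> > >>     <name>hbase.bucketcache.ioengine</name>
>> > >>     <value>offheap</value>
>> > >>   </property>
>> > >>   <property>
>> > >>     <!-- capacity in MB (hypothetical value) -->
>> > >>     <name>hbase.bucketcache.size</name>
>> > >>     <value>40960</value>
>> > >>   </property>
>> > >>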
>> > >> The OS page cache is the most vulnerable and volatile: it cannot
>> > >> be controlled and can easily be polluted either by other
>> > >> processes or by HBase itself (a long scan). With the Block Cache
>> > >> you have more control, but the first truly usable *official*
>> > >> implementation is going to be part of the 0.98 release.
>> > >>
>> > >> As far as I understand, your use case would definitely be covered
>> > >> by something similar to BigTable's ScanCache (RowCache), but
>> > >> there is no such cache in HBase yet. One major advantage of a
>> > >> RowCache vs the BlockCache (apart from being much more efficient
>> > >> in RAM usage) is resilience to Region compactions. Each minor
>> > >> Region compaction partially invalidates a Region's data in the
>> > >> BlockCache, and a major compaction invalidates it completely.
>> > >> This would not be the case with a RowCache (were it implemented).
>> > >>
>> > >> Best regards,
>> > >> Vladimir Rodionov
>> > >> Principal Platform Engineer
>> > >> Carrier IQ, www.carrieriq.com
>> > >> e-mail: vrodionov@carrieriq.com
>> > >>
>> > >> ________________________________________
>> > >> From: Ramu M S [ramu.malur@gmail.com]
>> > >> Sent: Monday, October 07, 2013 5:25 PM
>> > >> To: user@hbase.apache.org
>> > >> Subject: Re: HBase Random Read latency > 100ms
>> > >>
>> > >> Vladimir,
>> > >>
>> > >> Yes, I am fully aware of the HDD limitation and the wrong
>> > >> configuration w.r.t. RAID.
>> > >> Unfortunately, the hardware is leased from others for this work,
>> > >> and I wasn't consulted on the h/w specification for the tests
>> > >> that I am doing now. Even the RAID cannot be turned off or set to
>> > >> RAID-0.
>> > >>
>> > >> The production system is specified according to Hadoop's needs
>> > >> (100 nodes with 16-core CPUs, 192 GB RAM, and 24 x 600 GB SAS
>> > >> drives; RAID cannot be completely turned off, so we are creating
>> > >> one Virtual Disk containing only one Physical Disk, with the VD
>> > >> RAID level set to RAID-0). These systems are still not available.
>> > >> If you have any suggestions on the production setup, I will be
>> > >> glad to hear them.
>> > >>
>> > >> Also, as pointed out earlier, we are planning to use HBase as an
>> > >> in-memory KV store to access the latest data.
>> > >> That's why the RAM in this configuration is so large. But it
>> > >> looks like we would run into more problems than gains from this.
>> > >>
>> > >> Keeping that aside, I was trying to get the maximum out of the
>> > >> current cluster. Or, as you said, is 500-1000 OPS the max I could
>> > >> get out of this setup?
>> > >>
>> > >> Regards,
>> > >> Ramu
>> > >>
>> > >>
>> > >>
>> > >
>> > >
>> >
>> >
>>
>>
>
>

