Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
Received-SPF: pass (athena.apache.org: domain of ramu.malur@gmail.com
 designates 209.85.160.50 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <DC5EBE7F3610EB4CA5C7E92D76873E15180DD42ACF@exchange2007.carrieriq.com>
Date: Wed, 9 Oct 2013 10:00:44 +0900
Message-ID: 
 <CADEELOTuYtKSCkm1gDuwGdHE_+HocyuUdYvF+wyO6pRJow_ajg@mail.gmail.com>
Subject: Re: HBase Random Read latency > 100ms
From: Ramu M S <ramu.malur@gmail.com>
To: user@hbase.apache.org
Content-Type: multipart/alternative; boundary=001a1136335434ae4404e8446ae9

--001a1136335434ae4404e8446ae9
Content-Type: text/plain; charset=ISO-8859-1

Hi All,

Based on a few suggestions from the earlier mails, I made the following changes:

1. Heap Size to 16 GB
2. Block Size to 16KB
3. HFile size to 8 GB (Table now has 256 regions, 32 per server)
4. Data Locality Index is 100 in all RS

I have clients running on 10 machines, each with 4 threads, so 40 client
threads in total. This is the same in all tests.

Result:
           1. Average latency is still >100ms.
           2. Heap occupancy is around 2-2.5 GB in all RS
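
As a rough cross-check of what this latency implies for throughput, here
is a minimal sketch. It assumes each of the 40 client threads issues one
blocking read at a time (an assumption, the client code is not shown in
this thread). The function name is made up for illustration.

```python
def max_throughput_ops_per_sec(client_threads: int, avg_latency_ms: float) -> float:
    """Upper bound on total ops/sec for a closed loop of blocking clients.

    ASSUMPTION: every thread waits for its read to return before sending
    the next one (no batching, no asynchronous requests).
    """
    if avg_latency_ms <= 0:
        raise ValueError("avg_latency_ms must be positive")
    return client_threads * 1000.0 / avg_latency_ms


if __name__ == "__main__":
    # 40 client threads and 100 ms average latency, as reported above.
    total = max_throughput_ops_per_sec(40, 100.0)
    print(f"at most {total:.0f} reads/sec across the whole cluster")
```

If the clients batch or pipeline requests, this bound does not apply.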

A few more tests were carried out yesterday:

TEST 1: Small data set (100 Million records, each with 724 bytes).
===========================================
Configurations:
1. Heap Size to 1 GB
2. Block Size to 16KB
3. HFile size to 1 GB (Table now has 128 regions, 16 per server)
4. Data Locality Index is 100 in all RS

I disabled Block Cache on the table to make sure that most reads are
served from disk.

Result:
   1. Average latency is 8 ms and throughput went up to 6K/sec per RS.
   2. With Block Cache enabled again, I got average latency of around 2 ms
and throughput of 10K/sec per RS.
       Heap occupancy was around 650 MB.
   3. After increasing the heap to 16 GB, with Block Cache still enabled, I
got average latency of around 1 ms and throughput of 20K/sec per RS.
       Heap occupancy was around 2-2.5 GB in all RS.

TEST 2: Large Data set (1.8 Billion records, each with 724 bytes)
==================================================
Configurations:
1. Heap Size to 1 GB
2. Block Size to 16KB
3. HFile size to 1 GB (Table now has 2048 regions, 256 per server)
4. Data Locality Index is 100 in all RS

Result:
  1. Average Latency is > 500ms to start with and gradually decreases, but
even after around 100 Million reads it is still >100 ms
  2. Block Cache = TRUE/FALSE does not make any difference here. Even Heap
Size (1GB / 16GB) does not make any difference.
  3. Heap occupancy is around 2-2.5 GB under 16GB Heap and around 650 MB
under 1GB Heap.
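
To put the two data sets side by side, here is a back-of-the-envelope
sketch of how much raw data each RS holds compared with what a block
cache could keep. The 8 region servers follow from the region counts
above (256 regions at 32 per server). The 0.25 cache fraction is an
assumption, not a value taken from this cluster, and the estimate ignores
per-cell key overhead, compression and the OS page cache.

```python
RECORD_BYTES = 724      # record size from the tests above
REGION_SERVERS = 8      # 256 regions / 32 per server
CACHE_FRACTION = 0.25   # ASSUMPTION: share of the heap used by the block cache


def per_rs_data_gb(records: int) -> float:
    """Raw data per region server in decimal GB, assuming an even spread."""
    return records * RECORD_BYTES / REGION_SERVERS / 1e9


def cacheable_share(records: int, heap_gb: float) -> float:
    """Fraction of one server's raw data that fits in its block cache."""
    cache_gb = heap_gb * CACHE_FRACTION
    return min(1.0, cache_gb / per_rs_data_gb(records))


if __name__ == "__main__":
    for name, records in (("small", 100_000_000), ("large", 1_800_000_000)):
        for heap_gb in (1, 16):
            print(
                f"{name} data set, {heap_gb} GB heap: "
                f"{per_rs_data_gb(records):.1f} GB per RS, "
                f"{cacheable_share(records, heap_gb):.1%} cacheable"
            )
```

Under these assumptions the large data set comes to roughly 163 GB of raw
data per RS, so even a 16 GB heap could hold only a few percent of it in
the block cache.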

GC Time in all of the scenarios is around 2ms/Second, as shown in the
Cloudera Manager.

Reading most of the items from disk in the small data set scenario gives
better results and very low latencies.

The number of regions per RS and the HFile size do make a huge difference
in my cluster.
If I keep 100 regions per RS as the maximum (most of the discussions
suggest this), should I restrict the HFile size to 1 GB, and thus reduce
the storage capacity from 700 GB to 100 GB per RS?
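
The trade-off in this question is simple arithmetic, sketched below. It
assumes that "HFile size" here means the maximum region size
(presumably hbase.hregion.max.filesize) and that every region is allowed
to grow to that limit. The function names are made up for illustration.

```python
def capacity_per_rs_gb(max_regions_per_rs: int, max_region_gb: float) -> float:
    """Storage one RS can hold if every region grows to the size limit."""
    return max_regions_per_rs * max_region_gb


def region_size_for_capacity_gb(target_capacity_gb: float, max_regions_per_rs: int) -> float:
    """Region size limit needed to reach a target capacity per RS."""
    if max_regions_per_rs <= 0:
        raise ValueError("max_regions_per_rs must be positive")
    return target_capacity_gb / max_regions_per_rs


if __name__ == "__main__":
    # 100 regions per RS capped at 1 GB each.
    print(capacity_per_rs_gb(100, 1), "GB per RS")
    # Region size needed to keep 700 GB per RS within 100 regions.
    print(region_size_for_capacity_gb(700, 100), "GB per region")
```

So capping at 100 regions and 1 GB per region gives the 100 GB per RS
mentioned above, while keeping 700 GB per RS within 100 regions would
need regions of about 7 GB each.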

Please advise.

Thanks,
Ramu


On Wed, Oct 9, 2013 at 4:58 AM, Vladimir Rodionov
<vrodionov@carrieriq.com> wrote:

> What are your current heap and block cache sizes?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Ramu M S [ramu.malur@gmail.com]
> Sent: Monday, October 07, 2013 10:55 PM
> To: user@hbase.apache.org
> Subject: Re: HBase Random Read latency > 100ms
>
> Hi All,
>
> Average Latency is still around 80ms.
> I have done the following,
>
> 1. Enabled Snappy Compression
> 2. Reduce the HFile size to 8 GB
>
> Should I attribute these results to bad Disk Configuration OR anything else
> to investigate?
>
> - Ramu
>
>
> On Tue, Oct 8, 2013 at 10:56 AM, Ramu M S <ramu.malur@gmail.com> wrote:
>
> > Vladimir,
> >
> > Thanks for the Insights into Future Caching features. Looks very
> > interesting.
> >
> > - Ramu
> >
> >
> > On Tue, Oct 8, 2013 at 10:45 AM, Vladimir Rodionov <
> > vrodionov@carrieriq.com> wrote:
> >
> >> Ramu,
> >>
> >> If your working set of data fits into 192GB you may get an additional
> >> boost by utilizing the OS page cache, or wait until the 0.98 release,
> >> which introduces a new bucket cache implementation (a port of the
> >> Facebook L2 cache). You can try the vanilla bucket cache in 0.96 (not
> >> released yet, but due soon). Both caches store data off-heap, but the
> >> Facebook version can store encoded and compressed data and the vanilla
> >> bucket cache does not. There are some options for utilizing the
> >> available RAM efficiently (at least in upcoming HBase releases). If
> >> your data set does not fit in RAM then your only hope is your 24 SAS
> >> drives, depending on your RAID settings, disk IO performance, and HDFS
> >> configuration (I think the latest Hadoop is preferable here).
> >>
> >> OS page cache is most vulnerable and volatile, it can not be controlled
> >> and can be easily polluted by either some other processes or by HBase
> >> itself (long scan).
> >> With Block cache you have more control but the first truly usable
> >> *official* implementation is going to be a part of 0.98 release.
> >>
> >> As far as I understand, your use case would definitely be covered by
> >> something similar to the BigTable ScanCache (RowCache), but there is
> >> no such cache in HBase yet.
> >> One major advantage of RowCache vs BlockCache (apart from being much
> >> more efficient in RAM usage) is resilience to Region compactions. Each
> >> minor Region compaction partially invalidates the Region's data in
> >> BlockCache, and a major compaction invalidates this Region's data
> >> completely. This is not the case with RowCache (were it to be
> >> implemented).
> >>
> >> Best regards,
> >> Vladimir Rodionov
> >> Principal Platform Engineer
> >> Carrier IQ, www.carrieriq.com
> >> e-mail: vrodionov@carrieriq.com
> >>
> >> ________________________________________
> >> From: Ramu M S [ramu.malur@gmail.com]
> >> Sent: Monday, October 07, 2013 5:25 PM
> >> To: user@hbase.apache.org
> >> Subject: Re: HBase Random Read latency > 100ms
> >>
> >> Vladimir,
> >>
> >> Yes. I am fully aware of the HDD limitation and wrong configurations wrt
> >> RAID.
> >> Unfortunately, the hardware is leased from others for this work and I
> >> wasn't consulted to decide the h/w specification for the tests that I am
> >> doing now. Even the RAID cannot be turned off or set to RAID-0
> >>
> >> Production system is according to the Hadoop needs (100 Nodes with 16
> >> Core CPU, 192 GB RAM, 24 X 600GB SAS Drives; RAID cannot be completely
> >> turned off, so we are creating 1 Virtual Disk containing only 1
> >> Physical Disk, with the VD RAID level set to RAID-0). These systems are
> >> still not available. If you have any suggestions on the production
> >> setup, I will be glad to hear them.
> >>
> >> Also, as pointed out earlier, we are planning to use HBase also as an
> >> in-memory KV store to access the latest data.
> >> That's why the RAM was made so large in this configuration. But it
> >> looks like we would run into more problems than gains from this.
> >>
> >> Keeping that aside, I was trying to get the maximum out of the current
> >> cluster. Or, as you said, is 500-1000 OPS the max I could get out of
> >> this setup?
> >>
> >> Regards,
> >> Ramu
> >>
> >>
> >>
> >> Confidentiality Notice:  The information contained in this message,
> >> including any attachments hereto, may be confidential and is intended
> >> to be read only by the individual or entity to whom this message is
> >> addressed. If the reader of this message is not the intended recipient
> >> or an agent or designee of the intended recipient, please note that any
> >> review, use, disclosure or distribution of this message or its
> >> attachments, in any form, is strictly prohibited.  If you have received
> >> this message in error, please immediately notify the sender and/or
> >> Notifications@carrieriq.com and delete or destroy any copy of this
> >> message and its attachments.
> >>
> >
> >
>

--001a1136335434ae4404e8446ae9--