From: Asaf Mesika <asaf.mesika@gmail.com>
To: user@hbase.apache.org
Date: Thu, 4 Apr 2013 07:21:41 +0300
Subject: Re: Read thruput

Can you possibly batch some of the Get calls into a Scan with a Filter that
carries the list of row keys you need? For example, if you have 100 Gets, you
can build the start key and end key by taking the min and max of those 100 row
keys. Next, you write a filter which keeps these 100 row keys in a private
member and uses the hint method of the Filter interface to jump to the closest
row key in the region it scans. If you need help with that, I can add a more
detailed description of that Filter. This should remove most of the heavyweight
overhead of processing each Get.

On Tuesday, April 2, 2013, Vibhav Mundra wrote:

> What do your client calls look like? Get? Scan? Filters?
> --My client keeps issuing Get requests; each call fetches a single row.
> Essentially we are using HBase for key-value retrieval.
>
> Is 3000/sec client-side calls, or is it the number of rows per sec?
> --3000/sec is the client-side call rate.
>
> If you measure in MB/sec, how much read throughput do you get?
> --Each request's response is at most 1 KB, so the throughput is about
> 3 MB/sec (3000 * 1 KB).
>
> Where is your client located? Same router as the cluster?
> --It is on the same cluster, on the same subnet.
>
> Have you activated dfs read short circuit? If not, try it.
> --I have not tried this. Let me try this also.
>
> Compression - try switching to Snappy - should be faster.
> What else is running on the cluster in parallel to your reading client?
> --There is a small upload job running. I have never seen the CPU usage above
> 5%, so I actually didn't bother to look at this angle.
>
> -Vibhav
>
>
> On Tue, Apr 2, 2013 at 1:42 AM, Asaf Mesika wrote:
>
> > What do your client calls look like? Get? Scan? Filters?
> > Is 3000/sec client-side calls, or is it the number of rows per sec?
> > If you measure in MB/sec, how much read throughput do you get?
> > Where is your client located? Same router as the cluster?
> > Have you activated dfs read short circuit? If not, try it.
> > Compression - try switching to Snappy - should be faster.
> > What else is running on the cluster in parallel to your reading client?
> >
> > On Monday, April 1, 2013, Vibhav Mundra wrote:
> >
> > > What is the general read throughput that one gets when using HBase?
> > >
> > > I am not able to achieve more than 3000/sec with a timeout of 50
> > > millisecs.
> > > In this case also, 10% of them are timing out.
> > >
> > > -Vibhav
> > >
> > >
> > > On Mon, Apr 1, 2013 at 11:20 PM, Vibhav Mundra wrote:
> > >
> > > > Yes, I have changed the BLOCK CACHE % to 0.35.
> > > >
> > > > -Vibhav
> > > >
> > > >
> > > > On Mon, Apr 1, 2013 at 10:20 PM, Ted Yu wrote:
> > > >
> > > >> I was aware of that discussion, which was about MAX_FILESIZE and
> > > >> BLOCKSIZE.
> > > >>
> > > >> My suggestion was about the block cache percentage.
> > > >>
> > > >> Cheers
> > > >>
> > > >>
> > > >> On Mon, Apr 1, 2013 at 4:57 AM, Vibhav Mundra wrote:
> > > >>
> > > >> > I have used the following site:
> > > >> > http://grokbase.com/t/hbase/user/11bat80x7m/row-get-very-slow
> > > >> >
> > > >> > to lower the value of the block cache.
> > > >> >
> > > >> > -Vibhav
> > > >> >
> > > >> >
> > > >> > On Mon, Apr 1, 2013 at 4:23 PM, Ted wrote:
> > > >> >
> > > >> > > Can you increase the block cache size?
> > > >> > >
> > > >> > > What version of HBase are you using?
> > > >> > >
> > > >> > > Thanks
> > > >> > >
> > > >> > > On Apr 1, 2013, at 3:47 AM, Vibhav Mundra wrote:
> > > >> > >
> > > >> > > > The typical size of each of my rows is less than 1 KB.
> > > >> > > >
> > > >> > > > Regarding memory, I have used 8 GB for the HBase region servers
> > > >> > > > and 4 GB for the datanodes, and I don't see them fully used.
> > > >> > > > So I ruled out the GC aspect.
> > > >> > > >
> > > >> > > > In case you still believe that GC is an issue, I will upload
> > > >> > > > the GC logs.
> > > >> > > >
> > > >> > > > -Vibhav
> > > >> > > >
> > > >> > > >
> > > >> > > > On Mon, Apr 1, 2013 at 3:46 PM, ramkrishna vasudevan <
> > > >> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
> > > >> > > >
> > > >> > > >> Hi
> > > >> > > >>
> > > >> > > >> How big is your row? Are they wide rows, and what would be
> > > >> > > >> the size of every cell?
> > > >> > > >> How many read threads are getting used?
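[Editor's note: the seek-hint filter Asaf describes at the top of the thread can be sketched in plain Java. HBase's own types (FilterBase, ReturnCode.SEEK_NEXT_USING_HINT, KeyValue.createFirstOnRow) are deliberately left out so the per-row decision logic compiles standalone; the class and method names here are illustrative, and a real filter would extend FilterBase and also implement the serialization needed to ship the key list to the region servers.]

```java
// Models the per-row decision of a seek-hint filter over a sorted list of
// target row keys: the scanner hands us the current row key, and we either
// include it, ask to seek forward to the next wanted key, or signal that all
// wanted keys are behind us. In a real HBase filter, SEEK_NEXT_USING_HINT
// maps to ReturnCode.SEEK_NEXT_USING_HINT and the hint is returned from the
// filter's hint method as the first KeyValue on nextWantedKey().
public class SeekHintLogic {
    public enum Decision { INCLUDE, SEEK_NEXT_USING_HINT, DONE }

    private final byte[][] wanted; // the ~100 target row keys, pre-sorted
    private int idx = 0;           // first target the scanner has not passed yet

    public SeekHintLogic(byte[][] sortedWanted) {
        this.wanted = sortedWanted;
    }

    // Unsigned lexicographic compare, like HBase's Bytes.compareTo.
    public static int cmp(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public Decision onRow(byte[] currentRow) {
        while (idx < wanted.length) {
            int c = cmp(currentRow, wanted[idx]);
            if (c == 0) { idx++; return Decision.INCLUDE; }   // a wanted row
            if (c < 0) return Decision.SEEK_NEXT_USING_HINT;  // jump forward
            idx++;  // the scanner is already past this target key
        }
        return Decision.DONE; // nothing left; the scan's stop row ends it
    }

    // The seek target to hand back as the hint.
    public byte[] nextWantedKey() {
        return wanted[idx];
    }
}
```

Combined with a Scan bounded by the min and max of the wanted keys, this lets the region server skip straight between the 100 rows instead of reading every row in between.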
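[Editor's note: two of the tuning knobs discussed in the thread (the block cache percentage Vibhav raised to 0.35, and the dfs read short circuit Asaf suggests) are plain configuration. A sketch of the relevant properties, assuming an HBase 0.94 / Hadoop 1.x-era cluster as in this 2013 thread; property names differ on later versions, and the user name below is an assumption.]

```xml
<!-- hbase-site.xml: give 35% of the RegionServer heap to the block cache. -->
<property>
  <name>hfile.block.cache.size</name>
  <value>0.35</value>
</property>

<!-- hdfs-site.xml (also visible to HBase): let the RegionServer read local
     HDFS block files directly, bypassing the DataNode - the "dfs read
     short circuit" mentioned above. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>
```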