MIME-Version: 1.0
In-Reply-To: <391D65D0EBFC9B4B95E117F72A360F1A010452FF@SHSMSX101.ccr.corp.intel.com>
References: <391D65D0EBFC9B4B95E117F72A360F1A01044C34@SHSMSX101.ccr.corp.intel.com>
 <391D65D0EBFC9B4B95E117F72A360F1A010452FF@SHSMSX101.ccr.corp.intel.com>
Date: Tue, 4 Jun 2013 14:24:51 +0300
Subject: Re: what's the typical scan latency?
From: Amit Mor
To: user@hbase.apache.org
Content-Type: multipart/alternative; boundary=001a11c1fb426c230f04de525410

--001a11c1fb426c230f04de525410
Content-Type: text/plain; charset=ISO-8859-1

What's your blockCacheHitCachingRatio? It tells you what fraction of the
reads that requested block caching (the default) were actually served
from the block cache. You can get it from the RS web UI.

What you are seeing could be caused by almost anything. For example:

- Is scanner caching (client side) enabled? If so, how many rows are
  cached, i.e. how many rows does each scanner.next RPC call return?
- What are your HFile block size, block cache % of total RS heap, max
  number of RPCs per RS for client connections, tcpnodelay setting,
  network topology and jitter, and number of NICs?
- Are you using an HTableInterface connection pool? The HBase client is
  synchronous, so how do you achieve concurrency?
- What about your percentiles? Is 5ms the mean? The median? Is 20ms only
  in the 99th percentile? And so on.

I am far from considering myself an expert on the general topic of
HBase, so take my tips with a pinch of salt. These are just factors I've
considered when trying to optimize my own read latency.

Hope that helps.

On Tue, Jun 4, 2013 at 4:02 AM, Liu, Raymond wrote:

> Thanks Amit
>
> In my environment, I run dozens of clients that each read about 5-20K
> of data per scan, concurrently, and the average read latency for cached
> data is around 5-20ms.
> So it seems there must be something wrong with my cluster env or
> application. Or did you run that with multiple clients?
>
>
> >Depends on so many environment-related variables, and on the data as
> >well. But to give you a number after all:
> >One of our clusters is on EC2: 6 RS on m1.xlarge machines (network
> >performance 'high' according to AWS). 90% of the time we do reads; our
> >avg data size is 2K, block cache at 20K, 100 rows per scan avg, bloom
> >filters 'on' at the 'ROW' level, 40% of heap dedicated to block cache
> >(note that it contains several other bits and pieces), and I would say
> >our average latency for cached data (~97% blockCacheHitCachingRatio) is
> >3-4ms. File system access is much, much more painful, especially on EC2
> >m1.xlarge, where you really can't tell what's going on, as far as I can
> >tell. To tell you the truth as I see it, this is an abuse (for our use
> >case) of the HBase store, and for cache-like behavior I would recommend
> >going to something like Redis.
>
>
> On Mon, Jun 3, 2013 at 12:13 PM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > What is it that you are observing now?
> >
> > Regards
> > Ram
> >
> >
> > On Mon, Jun 3, 2013 at 2:00 PM, Liu, Raymond
> > wrote:
> >
> > > Hi
> > >
> > > If all the data is already in the RS block cache,
> > > then what's the typical scan latency for scanning a few rows
> > > from, say, a several-GB table (with dozens of regions) on a small
> > > cluster with, say, 4 RS?
> > >
> > > A few ms? Tens of ms? Or more?
> > >
> > > Best Regards,
> > > Raymond Liu
> > >

--001a11c1fb426c230f04de525410--