Subject: Re: Increment operations in hbase
From: Anoop John <anoop.hbase@gmail.com>
To: user@hbase.apache.org
Date: Sun, 13 Jan 2013 12:52:57 +0530

>Another alternative is to get store files for each row hosted in that node
>operating directly on store files for each increment object ??

Sorry, I didn't get the idea. Can you explain please?

Regarding support for Increments in the batch API: sorry, I was checking the
0.94 code base. In 0.92 this support is not there. :(

Have you done any profiling of the operation at the RS side? How many HFiles
on average per store at the time of this op, and how many CFs for the table?
Do the Gets seem to be costly for you? Is this bulk increment op the only
thing happening at this time, or are there other concurrent ops? Is the block
cache getting used? Have you checked a metric like the cache hit ratio?

-Anoop-
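As a rough illustration of the HTable#batch() route discussed above, here is a
minimal sketch written against a 0.94-style client (per the thread, batch()
accepts Increments in 0.94 but not in 0.92). The table name "counters", the
column family "cf", the qualifier "count", and the row-key pattern are
placeholder assumptions, not taken from this thread:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Increment;
    import org.apache.hadoop.hbase.client.Row;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BatchIncrementExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "counters");   // placeholder table name
            byte[] family = Bytes.toBytes("cf");           // placeholder family
            byte[] qualifier = Bytes.toBytes("count");     // placeholder qualifier

            // Build one Increment per row and submit them all through a single
            // batch() call instead of issuing one increment() RPC per row.
            List<Row> actions = new ArrayList<Row>();
            for (int i = 0; i < 1000; i++) {
                Increment inc = new Increment(Bytes.toBytes("row-" + i));
                inc.addColumn(family, qualifier, 1L);
                actions.add(inc);
            }

            Object[] results = new Object[actions.size()];
            table.batch(actions, results);  // per the thread: supported in 0.94, not 0.92
            table.close();
        }
    }

On a 0.92 client this route is not available for increments, as noted above,
so the fallback there is plain HTable.increment calls (see the sketch at the
end of the thread).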
On Sun, Jan 13, 2013 at 12:20 PM, kiran wrote:

> I am using hbase 0.92.1 and the table is split evenly across 19 nodes, and I
> know the node region splits. I can construct increment objects for each row
> hosted in that node according to the splits (30-50k approx in 15 min per
> node) ...
>
> There is no batch increment support (in the API it is given that it supports
> only get, put and delete)... Can I directly use HTable.increment for the
> 30-50k increment objects in each node, sequentially or multithreaded, and
> finish in 15 min?
>
> Another alternative is to get store files for each row hosted in that node,
> operating directly on store files for each increment object ??
>
> On Sun, Jan 13, 2013 at 1:50 AM, Varun Sharma wrote:
>
> > IMHO, this seems too low - 1 million operations in 15 minutes translates
> > to 2K increment operations per second, which should be easy to support.
> > Moreover, you are running increments on different rows, so contention due
> > to row locks is also not likely to be a problem.
> >
> > On hbase 0.94.0, I have seen up to 1K increments per second on a single
> > row (note that this will be significantly slower than incrementing
> > individual rows because of contention, and also this would be limited to
> > 1 node, the one which hosts the row). So I would assume that throughput
> > should be significantly higher for increments across multiple rows. How
> > many nodes are you using, and is the table appropriately split across the
> > nodes?
> >
> > On Sat, Jan 12, 2013 at 10:59 AM, Ted Yu wrote:
> >
> > > Can you tell us which version of HBase you are using?
> > >
> > > Thanks
> > >
> > > On Sat, Jan 12, 2013 at 10:57 AM, Asaf Mesika wrote:
> > >
> > > > Most time is spent reading from the store file and not on the network
> > > > transfer time of the Increment objects.
> > > >
> > > > Sent from my iPhone
> > > >
> > > > On 12 Jan 2013, at 17:40, Anoop John wrote:
> > > >
> > > > Hi
> > > >     Can you check with using the API HTable#batch()? Here you can
> > > > batch a number of increments for many rows in just one RPC call. It
> > > > might help you to reduce the net time taken. Good luck.
> > > >
> > > > -Anoop-
> > > >
> > > > On Sat, Jan 12, 2013 at 4:07 PM, kiran wrote:
> > > >
> > > > Hi,
> > > >
> > > > My usecase is that I need to increment 1 million rows within 15 mins.
> > > > I tried two approaches, but neither of them yielded results.
> > > >
> > > > I have used HTable.increment, but it is not getting completed in the
> > > > specified time. I tried multi-threading also, but it is very costly.
> > > > I have also implemented get and put as another alternative, but that
> > > > approach is also not getting completed in 15 mins.
> > > >
> > > > Can I use any low-level implementation like using "Store or
> > > > HRegionServer" to increment 1 million rows? I know the table splits,
> > > > the region servers serving them, and the rows which fall into the
> > > > table splits. I suspect the major concern is network I/O rather than
> > > > processing with the above two approaches.
> > > >
> > > > --
> > > > Thank you
> > > > Kiran Sarvabhotla
> > > >
> > > > -----Even a correct decision is wrong when it is taken late
>
> --
> Thank you
> Kiran Sarvabhotla
>
> -----Even a correct decision is wrong when it is taken late
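For reference, a rough sketch of the multithreaded HTable.increment approach
kiran asks about, written against a 0.92-style client (each worker gets its
own HTable because HTable is not thread-safe). The table/family/qualifier
names, the row-key pattern, and the thread count are placeholder assumptions,
not taken from this thread:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Increment;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ParallelIncrementExample {

        // One HTable instance per worker thread; all workers share the same
        // Configuration (and so the same underlying connection resources).
        static class IncrementWorker implements Callable<Integer> {
            private final Configuration conf;
            private final List<byte[]> rows;

            IncrementWorker(Configuration conf, List<byte[]> rows) {
                this.conf = conf;
                this.rows = rows;
            }

            public Integer call() throws IOException {
                HTable table = new HTable(conf, "counters"); // placeholder table name
                try {
                    for (byte[] row : rows) {
                        Increment inc = new Increment(row);
                        inc.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("count"), 1L);
                        table.increment(inc);                // one RPC per row on 0.92
                    }
                } finally {
                    table.close();
                }
                return rows.size();
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            int threads = 8;                                 // tune per client host
            ExecutorService pool = Executors.newFixedThreadPool(threads);

            // Split the rows to increment into one slice per worker thread.
            List<List<byte[]>> slices = new ArrayList<List<byte[]>>();
            for (int t = 0; t < threads; t++) {
                slices.add(new ArrayList<byte[]>());
            }
            for (int i = 0; i < 50000; i++) {
                slices.get(i % threads).add(Bytes.toBytes("row-" + i));
            }

            List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
            for (List<byte[]> slice : slices) {
                futures.add(pool.submit(new IncrementWorker(conf, slice)));
            }
            for (Future<Integer> f : futures) {
                f.get();                                     // propagate any worker failure
            }
            pool.shutdown();
        }
    }

Whether this finishes 30-50k increments per node in 15 minutes depends on the
per-increment read cost discussed above (HFiles per store, block cache hit
ratio), which is why the profiling questions in the reply matter.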