Subject: Re: HBASE-2182
From: Ryan Rawson
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Date: Sat, 30 Jun 2012 01:27:03 -0700

On Fri, Jun 29, 2012 at 5:04 PM, Todd Lipcon wrote:
> A few inline notes below:
>
> On Fri, Jun 29, 2012 at 4:42 PM, Elliott Clark wrote:
>
>> I just posted a pretty early skeleton
>> (https://issues.apache.org/jira/browse/HBASE-2182) on what I think a
>> netty based hbase client/server could look like.
>>
>> Pros:
>>
>>   - Faster
>>      - Giraph got a 3x perf improvement by dropping hadoop rpc
>>
>
> What's the reference for this? The 3x perf I heard about from Giraph was
> from switching to using LMAX's Disruptor instead of queues, internally. We
> could do the same, but I'm not certain the model works well for our use
> cases where the RPC processing can end up blocked on disk access, etc.
>
>
>>      - Asynchbase trounces our client when JD benchmarked them
>>
>
> I'm still convinced that the majority of this has to do with the way our
> batching happens to the server, not async vs sync. (In the current sync
> client, once we fill up the buffer, we "flush" from the same thread, and
> block the flush until all buffered edits have made it, vs doing it in the
> background.) We could fix this without going to a fully async model.

I also agree here. If you do the a priori code analysis, it becomes
obvious that the issue is that slower regionservers can hold up entire
batches even if 90%+ of the Puts were already acked... And don't
forget that we used to issue Puts to regionservers SERIALLY until we
did the current parallelism code... (not that the code is great, but it
was relatively easy to fix at the time).
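
To make the batching point concrete, here is a minimal sketch of the two
flush styles. It is not HBase client code: FlushSketch, sendToServer,
blockingFlush and backgroundFlush are hypothetical names, and Thread.sleep
stands in for a regionserver round trip. The only thing it tries to show is
that a blocking flush waits on the slowest server, while a background flush
hands control back and collects acks as they arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class FlushSketch {

    // Hypothetical stand-in for sending one sub-batch of edits to one server.
    static int sendToServer(int edits, long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return edits; // in this sketch every edit in the sub-batch is acked
    }

    // Start one asynchronous send per server, in parallel.
    static List<CompletableFuture<Integer>> sendAll(
            ExecutorService pool, int[] edits, long[] latencies) {
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < edits.length; i++) {
            final int n = edits[i];
            final long latency = latencies[i];
            futures.add(CompletableFuture.supplyAsync(() -> sendToServer(n, latency), pool));
        }
        return futures;
    }

    // Blocking flush: the caller is stuck until the slowest server has acked.
    static int blockingFlush(ExecutorService pool, int[] edits, long[] latencies) {
        int acked = 0;
        for (CompletableFuture<Integer> f : sendAll(pool, edits, latencies)) {
            acked += f.join();
        }
        return acked;
    }

    // Background flush: returns at once; acks are counted as they arrive.
    static CompletableFuture<Void> backgroundFlush(
            ExecutorService pool, int[] edits, long[] latencies, AtomicInteger acked) {
        List<CompletableFuture<Void>> done = new ArrayList<>();
        for (CompletableFuture<Integer> f : sendAll(pool, edits, latencies)) {
            done.add(f.thenAccept(acked::addAndGet));
        }
        return CompletableFuture.allOf(done.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        int[] edits = {90, 5, 5};          // 90% of the edits go to fast servers
        long[] latencies = {10, 10, 300};  // one slow server (milliseconds)

        System.out.println("blocking flush acked " + blockingFlush(pool, edits, latencies));

        AtomicInteger acked = new AtomicInteger();
        CompletableFuture<Void> flush = backgroundFlush(pool, edits, latencies, acked);
        System.out.println("acked when background flush returned: " + acked.get());
        flush.join();
        System.out.println("acked after completion: " + acked.get());
        pool.shutdown();
    }
}
```

Neither variant needs a fully async client API; the difference is only
whether the calling thread waits on the whole batch.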
>
>
>>   - Could encourage things to be a little more modular if everything isn't
>>   hanging directly off of HRegionServer
>>
> Sure, but not sure I see why this is Netty vs not-Netty
>
>
>>   - Netty is better about thread usage than hadoop rpc server.
>>
> Can you explain further?
>
>
>>   - Pretty easy to define an rpc protocol after all of the work on
>>   protobuf (Thanks everyone)
>>   - Decoupling the rpc server library from the hadoop library could allow
>>   us to rev the server code more easily.
>>   - The filter model is very easy to work with.
>>      - Security can be just a single filter.
>>      - Logging can be another
>>      - Stats can be another.
>>
>> Cons:
>>
>>   - Netty and non apache rpc servers don't play well together.  They might
>>   be able to but I haven't gotten there yet.
>>
> What do you mean "non apache rpc servers"?
>
>
>>   - Complexity
>>      - Two different servers in the src
>>      - Confusing users who don't know which to pick
>>   - Non-blocking could make the client harder to write.
>>
>>
>> I'm really just trying to gauge what people think of the direction and if
>> it's still something that is wanted.  The code is a loooooong way from even
>> being a tech demo, and I'm not a netty expert, so suggestions would be
>> welcomed.
>>
>> Thoughts? Are people interested in this? Should I push this to my github
>> so others can help?
>>
>
> IMO, I'd want to see a noticeable perf difference from the change -
> unfortunately it would take a fair amount of work to get to the point where
> you could benchmark it. But if you're willing to spend the time to get to
> that point, seems worth investigating.
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
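
For readers unfamiliar with the "filter model" mentioned in the quoted pros
list, the idea can be sketched in plain Java, without using Netty's own
classes. Everything here is hypothetical (FilterChainSketch, the Filter
interface, the "token:" prefix check, and the string-based requests); it
only shows how security, logging, and stats can each be one independent
stage wrapped around an endpoint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class FilterChainSketch {

    // A filter sees the request and decides whether to pass it to the next stage.
    interface Filter {
        String apply(String request, Function<String, String> next);
    }

    static final List<String> LOG = new ArrayList<>();
    static final AtomicInteger CALLS = new AtomicInteger();

    // Security: reject anything without the (made-up) "token:" prefix.
    static final Filter SECURITY = (req, next) ->
            req.startsWith("token:") ? next.apply(req.substring("token:".length())) : "DENIED";

    // Logging: record the request, then pass it on unchanged.
    static final Filter LOGGING = (req, next) -> {
        LOG.add(req);
        return next.apply(req);
    };

    // Stats: count requests that got this far.
    static final Filter STATS = (req, next) -> {
        CALLS.incrementAndGet();
        return next.apply(req);
    };

    // Wrap the endpoint in the filters, so the first filter in the list runs first.
    static Function<String, String> build(List<Filter> filters, Function<String, String> endpoint) {
        Function<String, String> chain = endpoint;
        for (int i = filters.size() - 1; i >= 0; i--) {
            final Filter f = filters.get(i);
            final Function<String, String> next = chain;
            chain = req -> f.apply(req, next);
        }
        return chain;
    }

    static Function<String, String> defaultChain() {
        return build(Arrays.asList(SECURITY, LOGGING, STATS), req -> "OK " + req);
    }

    public static void main(String[] args) {
        Function<String, String> chain = defaultChain();
        System.out.println(chain.apply("token:get row1"));
        System.out.println(chain.apply("get row1"));
        System.out.println("requests counted: " + CALLS.get());
    }
}
```

Because each concern is its own stage, adding or removing one means editing
the list passed to build, not the endpoint. A rejected request never reaches
the logging or stats stages in this ordering.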