From: Todd Lipcon
Date: Fri, 13 Apr 2012 20:02:43 -0700
Subject: Re: 0.92 and Read/writes not scaling
To: user@hbase.apache.org

To close the loop on this thread, we were able to track down the issue.
See https://issues.apache.org/jira/browse/HDFS-3280 - just committed it in
HDFS. It's a simple patch if you want to patch your own build. Otherwise
this should show up in CDH4 nightly builds tonight, and I think in CDH4b2
as well.

If you want to patch on the HBase side, you can edit HLog.java to remove
the checks for the "sync" method and have it only call "hflush". It's only
the compatibility path that caused the problem.
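[Editor's note: a minimal sketch, not the actual HLog.java change, of what
"only call hflush" could look like. It assumes a Hadoop release (0.21+/CDH4)
where FSDataOutputStream exposes hflush(); the class and method names below
are illustrative.]

    // Illustrative only -- not the real HLog.java diff. Calling hflush()
    // directly skips the reflection-based probe for the legacy sync()
    // method, i.e. the compatibility path mentioned above.
    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;

    class WalSyncSketch {
        private final FSDataOutputStream out;

        WalSyncSketch(FSDataOutputStream out) {
            this.out = out;
        }

        // Flush buffered WAL edits out to the datanodes.
        void syncWal() throws IOException {
            out.hflush();
        }
    }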
Thanks
-Todd

On Wed, Apr 4, 2012 at 8:02 PM, Juhani Connolly wrote:
> done, thanks for pointing me to that
>
> On 04/05/2012 11:43 AM, Ted Yu wrote:
>> Juhani:
>> Thanks for sharing your results.
>>
>> Do you mind putting the summary on HBASE-5699: Run with > 1 WAL in
>> HRegionServer?
>>
>> On Wed, Apr 4, 2012 at 6:45 PM, Juhani Connolly <
>> juhani_connolly@cyberagent.co.jp> wrote:
>>
>>> Another quick update:
>>>
>>> Since moving back to HDFS 0.20.2 (with HBase still at 0.92), we found
>>> that while we made significant gains in throughput, most of our
>>> regionservers' IPC threads were stuck somewhere in HWal.append (out of
>>> 50, 42 were in append, of which 20 were in sync), limiting throughput
>>> despite significant free hardware resources.
>>>
>>> Because the WAL writes of a single RS all go sequentially to one HDFS
>>> file, we assumed that we could improve throughput by spreading writes
>>> over more WAL files and more disks. To do this we ran multiple region
>>> servers on each node.
>>>
>>> The scaling wasn't linear (we were in no way increasing hardware, just
>>> the number of regionservers), but we are now getting significantly more
>>> throughput.
>>>
>>> I would personally not say that this is a great approach to have to
>>> take; it would generally be better to build more, smaller servers,
>>> which will then not limit themselves by trying to push a lot of data
>>> per server through a single WAL file.
>>>
>>> Of course, there may be another solution to this that I'm not aware of;
>>> if so, I'd love to hear it.

-- 
Todd Lipcon
Software Engineer, Cloudera
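[Editor's note: for readers who want to try the multiple-RegionServers-per-node
setup described in the quoted message above, a minimal sketch of the
per-instance overrides in hbase-site.xml. The port values are examples only,
and each extra RegionServer instance would need its own conf directory
containing overrides like these so it does not collide with the first
instance.]

    <!-- Illustrative hbase-site.xml overrides for a second RegionServer
         instance on the same host; port values are examples only. -->
    <property>
      <name>hbase.regionserver.port</name>
      <value>60120</value>
    </property>
    <property>
      <name>hbase.regionserver.info.port</name>
      <value>60130</value>
    </property>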