Date: Thu, 21 Jun 2012 13:27:23 +0200
Subject: Re: Blocking Inserts
From: Martin Alig
To: user@hbase.apache.org

Thank you for the suggestions.

So I changed the setup and now have:

1 Master running Namenode, SecondaryNamenode, ZK and the HMaster
7 Slaves running Datanode and Regionserver
2 Clients to insert data

What I forgot to mention in my first post is that the clients sometimes even get a SocketTimeoutException when inserting the data (of course, during that time 0 inserts are done).

Looking at the logs (I also turned on the GC logs), I see the following:

Multiple consecutive entries like:

2012-06-21 11:42:13,962 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 6 on 60020' on region usertable,user600,1340200683555.a45b03dd65a62afa676488921e47dbaa.: memstore size 1.0g is >= than blocking 1.0g size

Shortly after those entries, many entries like:

2012-06-21 12:43:53,028 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":35046,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@2642a14d), rpc version=1, client version=29, methodsFingerPrint=-1508511443","client":" 10.110.129.12:54624 ","starttimems":1340275397981,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}

Looking at the GC logs, many entries like:

2870.329: [GC 2870.330: [ParNew: 108450K->3401K(118016K), 0.0182570 secs] 4184711K->4079843K(12569856K), 0.0183510 secs] [Times: user=0.24 sys=0.00, real=0.01 secs]

These are always around 0.01 - 0.04 secs.

And also from the GC log:

2696.013: [CMS-concurrent-sweep: 8.999/10.448 secs] [Times: user=46.93 sys=2.24, real=10.45 secs]

Is the 10.45 secs too long? Or what exactly should I watch out for in the GC logs?

I also configured Ganglia to have a look at some more metrics. Looking at io_wait (which should be relevant to my question about the disks), I can observe values between 10 % and 25 % on the regionserver. Should that be lower?

Btw. I'm using HBase 0.94 and Hadoop 1.0.3.

Thank you again.

Martin

On Wed, Jun 20, 2012 at 7:04 PM, Dave Wang wrote:

> I'd also remove the DN and RS from the node running ZK, NN, etc. as you
> don't want heavyweight processes on that node.
>
> - Dave
>
> On Wed, Jun 20, 2012 at 9:31 AM, Elliott Clark wrote:
>
> > Basically without metrics on what's going on it's tough to know for sure.
> >
> > I would turn on GC logging and make sure that is not playing a part, get
> > metrics on IO while this is going on, and look through the logs to see
> > what is happening when you notice the pause.
> >
> > On Wed, Jun 20, 2012 at 6:39 AM, Martin Alig wrote:
> >
> > > Hi
> > >
> > > I'm doing some evaluations with HBase. The workload I'm facing is
> > > mainly insert-only.
> > > Currently I'm inserting 1KB rows, where 100 bytes go into one column.
> > >
> > > I have the following cluster machines at my disposal:
> > >
> > > Intel Xeon L5520 2.26 GHz (Nehalem, with HT enabled)
> > > 24 GiB Memory
> > > 1 GigE
> > > 2x 15k RPM SAS 73 GB (RAID1)
> > >
> > > I have 10 Nodes.
> > > The first node runs:
> > >
> > > Namenode, SecondaryNamenode, Datanode, HMaster, Zookeeper, and a
> > > RegionServer
> > >
> > > The other nodes run:
> > >
> > > Datanode and RegionServer
> > >
> > >
> > > Now running my test client and inserting rows, the throughput goes up
> > > to 150'000 inserts/sec. But then after some time the throughput drops
> > > to 0 inserts/sec for quite some time, before it goes up again.
> > > My assumption is that this happens when the RegionServers start to
> > > write the data from memory to the disks. I know that the recommended
> > > hardware for HBase should contain multiple disks using JBOD or RAID 0.
> > > But on that point I am limited right now.
> > >
> > > I am just asking whether, in my hardware setup, the blocking periods
> > > are really caused by the non-optimal disk configuration.
> > >
> > >
> > > Thank you in advance for any suggestions.
> > >
> > >
> > > Martin
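
To compare the ParNew collections with the CMS concurrent phases quoted in the message above, a small script along these lines could pull the timings out of a GC log. This is only a sketch: it assumes the log uses exactly the format of the two sample lines in the message, and the regular expressions and the function name summarize_gc_lines are illustrative, not part of any JVM or HBase tooling.

```python
import re

# Matches a ParNew young-generation collection and captures the total
# time reported for the whole collection (the second "secs" figure).
PARNEW_RE = re.compile(
    r"\[ParNew: [^\]]*\] [^,]*, (?P<pause>\d+\.\d+) secs\]"
)

# Matches a completed CMS concurrent phase such as
# "[CMS-concurrent-sweep: 8.999/10.448 secs]". Only the second number is
# captured; in the sample line it agrees with the "real=" wall-clock time.
CMS_RE = re.compile(
    r"\[CMS-concurrent-(?P<phase>[a-z-]+): \d+\.\d+/(?P<wall>\d+\.\d+) secs\]"
)

# The two GC log lines quoted in the message.
SAMPLE_LINES = [
    "2870.329: [GC 2870.330: [ParNew: 108450K->3401K(118016K), 0.0182570 secs] "
    "4184711K->4079843K(12569856K), 0.0183510 secs] "
    "[Times: user=0.24 sys=0.00, real=0.01 secs]",
    "2696.013: [CMS-concurrent-sweep: 8.999/10.448 secs] "
    "[Times: user=46.93 sys=2.24, real=10.45 secs]",
]


def summarize_gc_lines(lines):
    """Return (parnew_pauses, cms_phases) parsed from GC log lines.

    parnew_pauses is a list of collection times in seconds;
    cms_phases is a list of (phase_name, wall_clock_seconds) tuples.
    Lines that match neither pattern are ignored.
    """
    parnew_pauses = []
    cms_phases = []
    for line in lines:
        match = PARNEW_RE.search(line)
        if match:
            parnew_pauses.append(float(match.group("pause")))
            continue
        match = CMS_RE.search(line)
        if match:
            cms_phases.append((match.group("phase"), float(match.group("wall"))))
    return parnew_pauses, cms_phases


if __name__ == "__main__":
    pauses, phases = summarize_gc_lines(SAMPLE_LINES)
    if pauses:
        print("ParNew collections:", len(pauses), "longest (s):", max(pauses))
    for phase, wall in phases:
        print("CMS concurrent phase:", phase, "wall clock (s):", wall)
```

The two kinds of entries are kept apart on purpose: a CMS-concurrent-sweep phase runs alongside the application threads, so its 10.45 secs is elapsed time rather than a stop-the-world pause, whereas the ParNew figure is time during which the application is stopped.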