Subject: Re: Table in Inconsistent State; Perpetually pending region server transitions while loading lot of data into Hbase via MR
From: "Kevin O'dell" <kevin.odell@cloudera.com>
To: user@hbase.apache.org
Date: Thu, 1 Nov 2012 08:35:30 -0500

A couple of thoughts (it is still early here, so bear with me):

Did you presplit your table? You are on 0.92, so you might as well take
advantage of HFile v2 and use 10 GB region sizes.

You are loading over MR -- I assume with puts? Did you tune your memstore
and HLog size? You aren't using a different client version or anything
strange like that, are you?

The "can't close hlog" messages seem to indicate an inability to talk to
HDFS. Did you have connection issues there?

On Thu, Nov 1, 2012 at 5:20 AM, ramkrishna vasudevan <
ramkrishna.s.vasudevan@gmail.com> wrote:

> Can you try restarting the cluster, i.e. the master and the RS?
> Also, if the problem persists, try clearing the ZK data and restarting.
>
> Regards
> Ram
>
> On Thu, Nov 1, 2012 at 2:46 PM, Cheng Su wrote:
>
> > Sorry, my mistake. Please ignore the point about the "max store size
> > of a single CF".
> >
> > m(_ _)m
> >
> > On Thu, Nov 1, 2012 at 4:43 PM, Ameya Kantikar wrote:
> > > Thanks Cheng. I'll try increasing my max region size limit.
> > >
> > > However, I am not clear on this math:
> > >
> > > "Since you set the max file size to 2G, you can only store 2XN G data
> > > into a single CF."
> > >
> > > Why is that?
> > > My assumption is that even though a single region can only be 2 GB,
> > > I can still have hundreds of regions, and hence can store 200 GB+ of
> > > data in a single CF on my 10-machine cluster.
> > >
> > > Ameya
> > >
> > > On Thu, Nov 1, 2012 at 1:19 AM, Cheng Su wrote:
> > >
> > >> I met the same problem these days.
> > >> I'm not sure the error log is exactly the same, but I do have the
> > >> same exception:
> > >>
> > >> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> > >> Failed 1 action: NotServingRegionException: 1 time, servers with
> > >> issues: smartdeals-hbase8-snc1.snc1:60020,
> > >>
> > >> and the table is also neither enabled nor disabled, so I can't drop
> > >> it.
> > >>
> > >> I guess the problem is the total store size.
> > >> How many region servers do you have?
> > >> Since you set the max file size to 2G, you can only store 2XN G data
> > >> into a single CF.
> > >> (N is the number of your region servers)
> > >>
> > >> You might want to increase the max file size or add region servers.
> > >>
> > >> On Thu, Nov 1, 2012 at 3:29 PM, Ameya Kantikar wrote:
> > >> > One more thing: the HBase table in question is neither enabled
> > >> > nor disabled:
> > >> >
> > >> > hbase(main):006:0> is_disabled 'userTable1'
> > >> > false
> > >> > 0 row(s) in 0.0040 seconds
> > >> >
> > >> > hbase(main):007:0> is_enabled 'userTable1'
> > >> > false
> > >> > 0 row(s) in 0.0040 seconds
> > >> >
> > >> > Ameya
> > >> >
> > >> > On Thu, Nov 1, 2012 at 12:02 AM, Ameya Kantikar wrote:
> > >> >
> > >> >> Hi,
> > >> >>
> > >> >> I am trying to load a lot of data (around 1.5 TB) into a single
> > >> >> HBase table. I have set the region size to 2 GB, and I also set
> > >> >> hbase.regionserver.handler.count to 30.
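For reference, a minimal hbase-site.xml sketch of the two settings just mentioned; the values (2 GB and 30) are the ones stated in the thread, and the property names are the standard HBase ones:

```xml
<!-- Sketch of the settings described above; values come from the thread. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>2147483648</value> <!-- 2 GB: region splits once a store grows past this -->
</property>
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>30</value> <!-- RPC handler threads per region server -->
</property>
```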
> > >> >>
> > >> >> When I start loading data via MR, after a while, tasks start
> > >> >> failing with the following error:
> > >> >>
> > >> >> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
> > >> >> Failed 1 action: NotServingRegionException: 1 time, servers with
> > >> >> issues: smartdeals-hbase8-snc1.snc1:60020,
> > >> >>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
> > >> >>   at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
> > >> >>   at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
> > >> >>   at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:820)
> > >> >>   at org.apache.hadoop.hbase.client.HTable.put(HTable.java:795)
> > >> >>   at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:83)
> > >> >>   at com..mr.hbase.LoadUserCacheInHbase$TokenizerMapper.map(LoadUserCacheInHbase.java:33)
> > >> >>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> > >> >>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> > >> >>   at org.apache.hadoop.mapred.MapTask.run(MapTask.j
> > >> >>
> > >> >> On the hbase8 machine I see the following in the logs:
> > >> >>
> > >> >> ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Error while
> > >> >> syncing, requesting close of hlog
> > >> >> java.io.IOException: Reflection
> > >> >>   at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:230)
> > >> >>   at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1109)
> > >> >>   at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1213)
> > >> >>   at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1071)
> > >> >>   at java.lang.Thread.run(Thread.java:662)
> > >> >> Caused by: java.lang.reflect.InvocationTargetException
> > >> >>   at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> > >> >>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> >>   at java.lang.reflect.Method.invoke(Method.java:597)
> > >> >>   at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:228)
> > >> >>   ... 4 more
> > >> >>
> > >> >> I only have 15 map tasks each on a 10-machine cluster (150 map
> > >> >> tasks in total entering data into the HBase table).
> > >> >>
> > >> >> Further, I see 2-3 regions perpetually under "Regions in
> > >> >> Transition" in the HBase master web console, as follows:
> > >> >>
> > >> >> 8dcb3edee4e43faa3dbeac2db4f12274 userTable1,pookydearest@hotmail.com,1351728961461.8dcb3edee4e43faa3dbeac2db4f12274.
> > >> >>   state=PENDING_OPEN, ts=Thu Nov 01 06:39:57 UTC 2012 (409s ago),
> > >> >>   server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> > >> >> bb91fd0c855e60dd4159e0ad3fd52cda userTable1,m_skaare@yahoo.com,1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda.
> > >> >>   state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago),
> > >> >>   server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> > >> >> bd44334a11464baf85013c97d673e600 userTable1,tammikilgore@gmail.com,1351728952308.bd44334a11464baf85013c97d673e600.
> > >> >>   state=PENDING_OPEN, ts=Thu Nov 01 06:42:17 UTC 2012 (269s ago),
> > >> >>   server=smartdeals-hbase1-snc1.snc1,60020,1351751785514
> > >> >> ed1f7e7908fc232f10d78dd1e796a5d7 userTable1,jwoodel@triad.rr.com,1351728971232.ed1f7e7908fc232f10d78dd1e796a5d7.
> > >> >>   state=PENDING_OPEN, ts=Thu Nov 01 06:37:37 UTC 2012 (549s ago),
> > >> >>   server=smartdeals-hbase3-snc1.snc1,60020,1351747466016
> > >> >>
> > >> >> Note these are not going away even after 30 minutes.
> > >> >>
> > >> >> Further, after running "hbase hbck -summary" I get the following:
> > >> >>
> > >> >> Summary:
> > >> >>   -ROOT- is okay.
> > >> >>     Number of regions: 1
> > >> >>     Deployed on: smartdeals-hbase7-snc1.snc1,60020,1351747458782
> > >> >>   .META. is okay.
> > >> >>     Number of regions: 1
> > >> >>     Deployed on: smartdeals-hbase7-snc1.snc1,60020,1351747458782
> > >> >>   test1 is okay.
> > >> >>     Number of regions: 1
> > >> >>     Deployed on: smartdeals-hbase2-snc1.snc1,60020,1351747457308
> > >> >>   userTable1 is okay.
> > >> >>     Number of regions: 32
> > >> >>     Deployed on: smartdeals-hbase10-snc1.snc1,60020,1351747456776
> > >> >>       smartdeals-hbase2-snc1.snc1,60020,1351747457308
> > >> >>       smartdeals-hbase4-snc1.snc1,60020,1351747455571
> > >> >>       smartdeals-hbase5-snc1.snc1,60020,1351747458579
> > >> >>       smartdeals-hbase6-snc1.snc1,60020,1351747458186
> > >> >>       smartdeals-hbase7-snc1.snc1,60020,1351747458782
> > >> >>       smartdeals-hbase8-snc1.snc1,60020,1351747459112
> > >> >>       smartdeals-hbase9-snc1.snc1,60020,1351747455106
> > >> >> 24 inconsistencies detected.
> > >> >> Status: INCONSISTENT
> > >> >>
> > >> >> In the master logs I am seeing the following error:
> > >> >>
> > >> >> ERROR org.apache.hadoop.hbase.master.AssignmentManager: Failed
> > >> >> assignment in: smartdeals-hbase3-snc1.snc1,60020,1351747466016 due to
> > >> >> org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException:
> > >> >> Received:OPEN for the region:userTable1,m_skaare@yahoo.com,1351728968936.bb91fd0c855e60dd4159e0ad3fd52cda.,
> > >> >> which we are already trying to OPEN.
> > >> >>   at org.apache.hadoop.hbase.regionserver.HRegionServer.checkIfRegionInTransition(HRegionServer.java:2499)
> > >> >>   at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2457)
> > >> >>   at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
> > >> >>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> >>   at java.lang.reflect.Method.invoke(Method.java:597)
> > >> >>   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
> > >> >>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336)
> > >> >>
> > >> >> Am I missing something? How do I recover from this? How do I load
> > >> >> a lot of data via MR into HBase tables?
> > >> >>
> > >> >> I am running with the following setup:
> > >> >>
> > >> >> hadoop: 2.0.0-cdh4.0.1
> > >> >> hbase: 0.92.1-cdh4.0.1, r
> > >> >>
> > >> >> Would greatly appreciate any help.
> > >> >>
> > >> >> Ameya
> > >>
> > >> --
> > >> Regards,
> > >> Cheng Su
> >
> > --
> > Regards,
> > Cheng Su

-- 
Kevin O'Dell
Customer Operations Engineer, Cloudera
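The capacity arithmetic debated in the thread (Cheng Su's retracted "2xN GB per CF" claim versus Ameya's hundreds-of-regions assumption) can be sketched as follows. The 2 GB max region size and the 10-server cluster are from the thread; the 100-regions-per-server figure is purely an illustrative assumption.

```python
# Capacity arithmetic from the thread. The 2 GB max region size and the
# 10-server cluster are stated in the thread; regions_per_server = 100 is
# an illustrative assumption, not a figure anyone in the thread gave.

max_region_size_gb = 2
region_servers = 10

# Cheng Su's (later retracted) reading: capacity = 2 GB x N servers,
# as if a CF could hold only one region per server.
cheng_capacity_gb = max_region_size_gb * region_servers

# Ameya's reading: a CF spans many regions, so with e.g. 100 regions
# per server the same cluster holds two orders of magnitude more.
regions_per_server = 100
ameya_capacity_gb = max_region_size_gb * regions_per_server * region_servers

print(cheng_capacity_gb)   # 20
print(ameya_capacity_gb)   # 2000
```

Ameya's reading is the correct one, which is why Cheng Su withdrew the point: the max file size caps an individual region, not the total data a column family can hold.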