From: Weichen YE
To: Ted Yu
Cc: "user@hbase.apache.org", "dev@hbase.apache.org", 叶炜晨
Date: Wed, 4 Feb 2015 15:11:06 +0800
Subject: Re: Re: Re: Wrong Configuration lead to a failure when enabling table

Hi Ted, Ram,

Thank you for your attention to this bug. I hit it in a production
environment, and the table contains important data.

If we cannot enable this table in the current cluster, do you have any
idea how to get the table data back some other way? Maybe export,
snapshot, copytable, or distcp of all the table files to another cluster?

2015-02-04 13:17 GMT+08:00 Ted Yu:

> Looks like the NPE was caused by the following method in BaseLoadBalancer
> returning null:
>
>   protected Map<ServerName, List<HRegionInfo>> assignMasterRegions(
>       Collection<HRegionInfo> regions, List<ServerName> servers) {
>     if (servers == null || regions == null || regions.isEmpty()) {
>       return null;
>
> Since bulkPlan is null, calling BulkAssigner seems unnecessary.
>
> On Tue, Feb 3, 2015 at 9:01 PM, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
>> It is not only about the state on the table descriptor but also the
>> in-memory state in the AM. I remember some time back Rajeshbabu worked
>> on an HBCK-like tool which will forcefully change the state of these
>> tables in such cases.
>> I don't remember the JIRA now. I thought of restarting the master,
>> thinking the in-memory state would change, and I got this:
>>
>> java.lang.NullPointerException
>>         at org.apache.hadoop.hbase.master.handler.EnableTableHandler.handleEnableTable(EnableTableHandler.java:210)
>>         at org.apache.hadoop.hbase.master.handler.EnableTableHandler.process(EnableTableHandler.java:142)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.recoverTableInEnablingState(AssignmentManager.java:1695)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:416)
>>         at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:720)
>>         at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:170)
>>         at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1459)
>>         at java.lang.Thread.run(Thread.java:745)
>> 2015-02-04 16:11:45,932 FATAL [stobdtserver3:16040.activeMasterManager]
>> master.HMaster: Master server abort: loaded coprocessors are:
>> [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
>> 2015-02-04 16:11:45,933 FATAL [stobdtserver3:16040.activeMasterManager]
>> master.HMaster: Unhandled exception. Starting shutdown.
>> java.lang.NullPointerException
>>         at org.apache.hadoop.hbase.master.handler.EnableTableHandler.handleEnableTable(EnableTableHandler.java:210)
>>         at org.apache.hadoop.hbase.master.handler.EnableTableHandler.process(EnableTableHandler.java:142)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.recoverTableInEnablingState(AssignmentManager.java:1695)
>>         at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:416)
>>         at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:720)
>>         at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:170)
>>         at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1459)
>>
>> Regards
>> Ram
>>
>> On Wed, Feb 4, 2015 at 10:25 AM, Ted Yu wrote:
>>
>> > What about creating an offline tool which can modify the table
>> > descriptor so that the table goes to a designated state?
>> >
>> > Cheers
>> >
>> > On Tue, Feb 3, 2015 at 8:51 PM, ramkrishna vasudevan <
>> > ramkrishna.s.vasudevan@gmail.com> wrote:
>> >
>> > > I tried reproducing this scenario on trunk. The same problem exists.
>> > > Currently in the master the table state is noted in the table
>> > > descriptor and not in ZK. In the 0.98.XX versions it should be in ZK.
>> > >
>> > > When we tried to enable the table, the region assignment failed due
>> > > to ClassNotFound, and by then the state was already ENABLING. But
>> > > doing a describe on the table still shows it as DISABLED.
>> > >
>> > > I thought we could correct the configuration by issuing another
>> > > alter table command, but we are still not able to enable the table.
>> > >
>> > > Moving this to dev to see if there is any workaround for this issue.
>> > > If not, we may have to solve this issue across branches until we
>> > > have the Procedure V2 implementation ready on trunk.
>> > >
>> > > Any suggestions?
>> > >
>> > > Regards
>> > > Ram
>> > >
>> > > On Wed, Feb 4, 2015 at 4:05 AM, 叶炜晨 wrote:
>> > >
>> > > > My version is 0.98.6-cdh5.2.0, and the problem is in my production
>> > > > environment.
>> > > >
>> > > > So should I first delete the znode? And then how do I disable this
>> > > > table? My goal is to fix the wrong table configuration to get my
>> > > > data.
>> > > >
>> > > > From my mobile phone.
>> > > >
>> > > > On 2015-2-4 at 12:46 AM, ramkrishna vasudevan <
>> > > > ramkrishna.s.vasudevan@gmail.com> wrote:
>> > > >
>> > > > > I think the only way out here is to clear the zookeeper node.
>> > > > > But I am not sure of the ramifications of that.
>> > > > >
>> > > > > Which version are you using? The newer versions are
>> > > > > 'protobuf'fed.
>> > > > >
>> > > > > Are you running this in production?
>> > > > >
>> > > > > Regards
>> > > > > Ram
>> > > > >
>> > > > > On Tue, Feb 3, 2015 at 5:00 PM, yeweichen2010@gmail.com <
>> > > > > yeweichen2010@gmail.com> wrote:
>> > > > >>
>> > > > >> I tried HBCK, but it doesn't help.
>> > > > >>
>> > > > >> I want to disable the table so that I can use "alter" to fix
>> > > > >> the wrong configuration. But the table is now stuck in a state
>> > > > >> where both "is_enabled" and "is_disabled" return false.
>> > > > >>
>> > > > >> ________________________________
>> > > > >> yeweichen2010@gmail.com
>> > > > >>>
>> > > > >>> From: ramkrishna vasudevan
>> > > > >>> Date: 2015-02-03 19:55
>> > > > >>> To: user@hbase.apache.org
>> > > > >>> CC: yeweichen
>> > > > >>> Subject: Re: Wrong Configuration lead to a failure when enabling table
>> > > > >>>
>> > > > >>> Can you try HBCK? Did it help in any way? I remember something
>> > > > >>> was done related to failures in ENABLE/DISABLE table some time
>> > > > >>> back.
>> > > > >>>
>> > > > >>> Regards
>> > > > >>> Ram
>> > > > >>>
>> > > > >>> On Tue, Feb 3, 2015 at 3:38 PM, yeweichen2010@gmail.com <
>> > > > >>> yeweichen2010@gmail.com> wrote:
>> > > > >>>
>> > > > >>> > Hi, all,
>> > > > >>> >
>> > > > >>> > I ran the following commands in the hbase shell:
>> > > > >>> >
>> > > > >>> > disable 'TestTable'
>> > > > >>> > alter 'TestTable', CONFIGURATION =>
>> > > > >>> > {'hbase.regionserver.region.split.policy' => 'xxxxxxxxx'}
>> > > > >>> > enable 'TestTable'
>> > > > >>> >
>> > > > >>> > I meant to put
>> > > > >>> > "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy"
>> > > > >>> > in place of "xxxxxxxxx", but because of a spelling error the
>> > > > >>> > configuration is now wrong. When I enabled the table, it
>> > > > >>> > failed because of ClassNotFound.
>> > > > >>> >
>> > > > >>> > Now the problem: the table failed to enable and is stuck in
>> > > > >>> > an intermediate state. It is neither enabled nor disabled.
>> > > > >>> > How can I save my table and fix the wrong configuration?
>> > > > >>> >
>> > > > >>> > yeweichen2010@gmail.com