hbase-dev mailing list archives

From Xu Cang <xc...@salesforce.com>
Subject Re: +How can I abort pending procedures?
Date Mon, 10 Dec 2018 19:03:12 GMT
Hi Hossein,
If you are facing this issue on HBase branch-1, you cannot use hbck2.
Also be aware that some procedures are not abortable. The most practical
solution to your issue is to follow what Wellington mentioned earlier in
this thread. Removing "/hbase/MasterProcWALs" will remove all master
procedures for your cluster. I suggest backing up this directory before
removing it. Then you can fail over the active HMaster and retry creating
the tables you need.
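
The back-up-then-remove step suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: the backup location `/hbase-backup` and the helper name `build_commands` are hypothetical, and the script merely prints the `hdfs dfs` commands so they can be reviewed before anyone runs them by hand.

```python
import shlex
from datetime import datetime, timezone

# Directory named in this thread; everything else below is an assumption.
PROC_WAL_DIR = "/hbase/MasterProcWALs"


def build_commands(backup_root, now=None):
    """Return the HDFS commands that copy the procedure WALs aside, then remove them."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Timestamped target so repeated backups do not overwrite each other.
    backup_dir = f"{backup_root.rstrip('/')}/MasterProcWALs-{stamp}"
    return [
        ["hdfs", "dfs", "-mkdir", "-p", backup_root],
        ["hdfs", "dfs", "-cp", PROC_WAL_DIR, backup_dir],
        ["hdfs", "dfs", "-rm", "-r", PROC_WAL_DIR],
    ]


if __name__ == "__main__":
    # Hypothetical backup root; print only, nothing is executed.
    for cmd in build_commands("/hbase-backup"):
        print(shlex.join(cmd))
```

Printing rather than executing is deliberate: removing the directory discards every master procedure, so each command is worth checking against the cluster before it is run, and the HMaster failover still has to be done afterwards.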

Best,
Xu

On Mon, Dec 10, 2018 at 2:56 AM Wellington Chevreuil <
wellington.chevreuil@gmail.com> wrote:

> Hi Hossein, for which hbase version are you facing this issue?
> Removing "/hbase/MasterProcWALs" would probably help sort the
> mentioned error, but there might be some risk of creating other
> inconsistencies, depending on which procedures are running. Does the
> list_procedures command show any "running" procedures, or does it just
> list the finished ones?
> On Mon, Dec 10, 2018 at 02:39, Sakthi Vel
> <sakthivel.azhaku@gmail.com> wrote:
> >
> > Hi Hossein,
> >
> > Aborting procedures can be dangerous (especially if the procedure is not
> > rolled back). AFAIK, you can use the hbck2 (apache/hbase-operator-tools)
> > tool to abort a procedure using the 'bypass' option. I would like to
> > quote the official hbck2 doc here:
> >
> >  bypass [OPTIONS] <PID>...
> >    Options:
> >     -o,--override   override if procedure is running/stuck
> >     -r,--recursive  bypass parent and its children. SLOW! EXPENSIVE!
> >     -w,--lockWait   milliseconds to wait on lock before giving up; default=1
> >    Pass one (or more) procedure 'pid's to skip to procedure finish.
> >    Parent of bypassed procedure will also be skipped to the finish.
> >    Entities will be left in an inconsistent state and will require
> >    manual fixup. May need Master restart to clear locks still held.
> >    Bypass fails if procedure has children. Add 'recursive' if all
> >    you have is a parent pid to finish parent and children. This
> >    is SLOW, and dangerous so use selectively. Does not always work.
> >
> > +Other members, please correct me if I am wrong.
> >
> > Sakthi
> >
> > On Sun, Dec 9, 2018 at 6:18 PM Hossein Zolfi <hossein.zolfi@gmail.com>
> > wrote:
> >
> > > Hi,
> > > I ran the HBase performance tools, and thousands of tables were created.
> > > Our cluster is currently in an inconsistent state (we don't know the
> > > cause yet, but we are trying to find it). At first I tried to
> > > disable/drop the created tables (1700 tables), but nothing happened.
> > > list_procedures shows 492 rows, and it is not possible to abort any of
> > > them. Then I restarted the HMaster service, but now I get an endless
> > > stream of the following exception:
> > >
> > > 2018-12-09 20:01:30,194 WARN [MASTER_SERVER_OPERATIONS-master-4:16000-0] master.AssignmentManager: Failed assignment of t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac. to data-22-54,16020,1543392184445, waiting a little before trying on the same region server try=1 of 10
> > > org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received OPEN for the region:t53889,00000000000000000007603345,1542715604227.4cc63591941dbe92866388fbde075cac. , which we are already trying to CLOSE
> > >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:1604)
> > >         at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:22239)
> > >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2196)
> > >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> > >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> > >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> > >         at java.lang.Thread.run(Thread.java:748)
> > >
> > >         at sun.reflect.GeneratedConstructorAccessor10.newInstance(Unknown Source)
> > >         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> > >         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> > >         at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
> > >         at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
> > >         at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:330)
> > >         at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:772)
> > >         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2164)
> > >         at org.apache.hadoop.hbase.master.AssignmentManager$2.process(AssignmentManager.java:860)
> > >         at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
> > >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > >         at java.lang.Thread.run(Thread.java:748)
> > >
> > > How can we stop these log messages?
> > >
> > > Output of `list_procedures` contains something like this:
> > >
> > > 1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330 2018 Sun Dec 09 20:07:59 +0330 2018
> > >
> > > 1532 DisableTableProcedure (table=t21514) FINISHED Fri Dec 07 11:42:53 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
> > >
> > > 1534 DisableTableProcedure (table=t21518) FINISHED Fri Dec 07 11:53:02 +0330 2018 Sun Dec 09 20:07:57 +0330 2018
> > >
> > > 1535 DeleteTableProcedure (table=t13946) FINISHED Fri Dec 07 12:02:59 +0330 2018 Sun Dec 09 20:07:27 +0330 2018
> > >
> > >
> > > I don't know whether removing /hbase/MasterProcWALs from HDFS will
> > > cause problems or not.
> > >
> > > Any help will be appreciated.
> > >
> > > With best regards.
> > >
>
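
The question raised in this thread, whether `list_procedures` shows any procedure that is still running or only finished ones, can be answered with a small tally over saved shell output. The sketch below assumes the row layout seen in the quoted output (pid, procedure name, parenthesised detail, state, timestamps) with each row on a single line; the function name `summarize` and the regular expression are illustrative, not part of HBase.

```python
import re
from collections import Counter

# Assumed row layout: "<pid> <ProcedureName> (<detail>) <STATE> <timestamps...>"
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<name>\w+)\s+\((?P<detail>[^)]*)\)\s+(?P<state>[A-Z_]+)\b"
)


def summarize(output):
    """Count procedures per state and collect those not marked FINISHED."""
    counts = Counter()
    unfinished = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip headers, blank lines, and anything unrecognised
        counts[match["state"]] += 1
        if match["state"] != "FINISHED":
            unfinished.append((int(match["pid"]), match["name"], match["detail"]))
    return counts, unfinished


if __name__ == "__main__":
    # Rows copied from the list_procedures output quoted in this thread.
    sample = "\n".join([
        "1530 DisableTableProcedure (table=t2151) FINISHED Fri Dec 07 11:32:45 +0330 2018 Sun Dec 09 20:07:59 +0330 2018",
        "1532 DisableTableProcedure (table=t21514) FINISHED Fri Dec 07 11:42:53 +0330 2018 Sun Dec 09 20:07:27 +0330 2018",
        "1534 DisableTableProcedure (table=t21518) FINISHED Fri Dec 07 11:53:02 +0330 2018 Sun Dec 09 20:07:57 +0330 2018",
        "1535 DeleteTableProcedure (table=t13946) FINISHED Fri Dec 07 12:02:59 +0330 2018 Sun Dec 09 20:07:27 +0330 2018",
    ])
    counts, unfinished = summarize(sample)
    print(dict(counts))
    print(unfinished)
```

If the unfinished list is empty, every recognised row was already FINISHED, which bears on how risky removing the procedure WALs is likely to be.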
