hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: Infinite loop ReopenTableRegionsProcedure
Date Fri, 05 Jul 2019 16:53:34 GMT
Looks like known issue. Can you update to later branch-2.0 release or patch
what you have?
Thanks,
S

On Thu, Jul 4, 2019 at 6:59 PM Jacobo Coll <jacobo167@gmail.com> wrote:

> Hi
>
> I'm trying to create a view against an existing hbase table, which adds
> some coprocessors to the table. I'm using Hortonworks 3.1.2, so it has
> HBase 2.0.2.3.1.2.1-1
> <
> https://repo.hortonworks.com/content/repositories/releases/org/apache/hbase/hbase-server/2.0.2.3.1.2.1-1/
> >
> and Phoenix 5.0.0.3.1.2.1-1
> <
> https://repo.hortonworks.com/content/repositories/releases/org/apache/phoenix/phoenix/5.0.0.3.1.2.1-1/
> >.
> The cluster is deployed in Azure.
>
> The issue seems to be related with HBASE-20817
> <https://issues.apache.org/jira/browse/HBASE-20817>, but should be fixed
> in
> that version (I've checked that this patch was applied to that build)
>
> Just after creating a "view" in phoenix, the "ModifyTableProcedure"
> triggers a "ReopenTableRegionsProcedure" that enters into this infinite
> loop of "MoveRegionProcedure". This loop has a lapse of ~5s, and it fills
> up the list of procedures, and the procedure wal is not cleanup, as it
> never finishes the running procedure.
>
> This is a subset of the hbase-master log. The affected table has a
> pre-split of 100, so the log is quite large. I've shrunken some lines with
> dots.
>
>
>
> 2019-07-03 16:12:27,924 INFO [PEWorker-8]
> procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=267,
> ppid=266, state=RUNNABLE:REOPEN_TABLE_REGIONS_GET_REGIONS;
> ReopenTableRegionsProcedure table=opencga_jcoll_grch38_variants}]
> 2019-07-03 16:12:28,059 INFO [PEWorker-2]
> procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=268,
> ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960},
> {pid=269, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=7b8b7dc99aee4f524af41a86e10ac945,
> source=wn0-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131867,
> destination=wn0-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131867},
>
> ....................................................................................................,
> {pid=368, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=60ccd4513bc298b83d062cb0172ccba9,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960}]
> 2019-07-03 16:12:28,096 INFO  [PEWorker-5]
> procedure.MasterProcedureScheduler: Took xlock for pid=268, ppid=267,
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,116 INFO [PEWorker-5]
> procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=370,
> ppid=268, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5, override=true,
> server=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960}]
> 2019-07-03 16:12:28,247 INFO [PEWorker-4]
> procedure.MasterProcedureScheduler: Took xlock for pid=370, ppid=268,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure
> table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5, override=true,
> server=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,280 INFO [PEWorker-4]
> assignment.RegionTransitionProcedure: Dispatch pid=370, ppid=268,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
> UnassignProcedure table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5, override=true,
> server=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,659 INFO [PEWorker-13]
> procedure2.ProcedureExecutor: Finished subprocedure pid=370, resume
> processing parent pid=268, ppid=267,
> state=RUNNABLE:MOVE_REGION_ASSIGN, locked=true; MoveRegionProcedure
> hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,659 INFO [PEWorker-13]
> procedure2.ProcedureExecutor: Finished pid=370, ppid=268,
> state=SUCCESS; UnassignProcedure table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5, override=true,
> server=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> in 458msec, unfinishedSiblingCount=0
> 2019-07-03 16:12:28,662 INFO [PEWorker-8]
> procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=408,
> ppid=268, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure
> table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5,
> target=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960}]
> 2019-07-03 16:12:28,687 INFO [PEWorker-8]
> procedure.MasterProcedureScheduler: Took xlock for pid=408, ppid=268,
> state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure
> table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5,
> target=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,713 INFO [PEWorker-8] assignment.AssignProcedure:
> Starting pid=408, ppid=268, state=RUNNABLE:REGION_TRANSITION_QUEUE,
> locked=true; AssignProcedure table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5,
> target=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960;
> rit=OFFLINE, location=
> wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960;
> forceNewPlan=false, retain=false target
> svr=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:28,909 INFO [PEWorker-13]
> assignment.RegionTransitionProcedure: Dispatch pid=408, ppid=268,
> state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
> AssignProcedure table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5,
> target=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:29,435 INFO [PEWorker-13]
> procedure2.ProcedureExecutor: Finished subprocedure pid=408, resume
> processing parent pid=268, ppid=267, state=RUNNABLE, locked=true;
> MoveRegionProcedure hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> 2019-07-03 16:12:29,436 INFO [PEWorker-13]
> procedure2.ProcedureExecutor: Finished pid=408, ppid=268,
> state=SUCCESS; AssignProcedure table=opencga_jcoll_grch38_variants,
> region=fdad9893526ef840d117e6bea7c04bc5,
> target=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> in 684msec, unfinishedSiblingCount=0
> 2019-07-03 16:12:29,494 INFO [PEWorker-14]
> procedure2.ProcedureExecutor: Finished pid=268, ppid=267,
> state=SUCCESS; MoveRegionProcedure
> hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960
> in 1.3930sec, unfinishedSiblingCount=92
> 2019-07-03 16:12:36,744 INFO  [PEWorker-12]
> procedure2.ProcedureExecutor: Finished subprocedure pid=275, resume
> processing parent pid=267, ppid=266,
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED;
> ReopenTableRegionsProcedure table=opencga_jcoll_grch38_variants
> 2019-07-03 16:12:36,744 INFO  [PEWorker-12]
> procedure2.ProcedureExecutor: Finished pid=275, ppid=267,
> state=SUCCESS; MoveRegionProcedure
> hri=f552eccd01cfd00bc30bec5e19f398df,
> source=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806,
> destination=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806
> in 8.4340sec, unfinishedSiblingCount=0
> 2019-07-03 16:12:36,744 INFO  [PEWorker-12]
> procedure2.ProcedureExecutor: Finished subprocedure pid=275, resume
> processing parent pid=267, ppid=266,
> state=RUNNABLE:REOPEN_TABLE_REGIONS_CONFIRM_REOPENED;
> ReopenTableRegionsProcedure table=o
> pencga_jcoll_grch38_variants
> 2019-07-03 16:12:36,744 INFO  [PEWorker-12]
> procedure2.ProcedureExecutor: Finished pid=275, ppid=267,
> state=SUCCESS; MoveRegionProcedure
> hri=f552eccd01cfd00bc30bec5e19f398df,
> source=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806,
> destination=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806
> in 8.4340sec, unfinishedSiblingCount=0
> 2019-07-03 16:12:36,791 INFO  [PEWorker-5]
> procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=571,
> ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=fdad9893526ef840d117e6bea7c04bc5,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960},
> {pid=572, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProced
> ure hri=7b8b7dc99aee4f524af41a86e10ac945,
> source=wn0-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131867,
> destination=wn0-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131867},
> {pid=573
> , ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure
> hri=545caf13911c04263c8f84f2c14783b7,
> source=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742,
> destination=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742},
> {pid=574, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=6fd4397428a741d0fa67e1a2774f48d1,
> source=wn3-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131711,
> destination=wn3-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131711},
> {pid=575, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=053f3ff2a77982f98bb399d60aa0942b,
> source=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806,
> destination=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806},
> {pid=576, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=94415a23ead3e24367c12a0de1e90e28,
> source=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742,
> destination=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742},
> {pid=577, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=eacf81f79287d01f721a352407d5a1a5,
> source=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806,
> destination=wn2-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131806},
>
> ..................................................................................................................................,
> {pid=670, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=941c5d8178257b7fc6bfa76b7d760468,
> source=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742,
> destination=wn4-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131742},
> {pid=671, ppid=267, state=RUNNABLE:MOVE_REGION_UNASSIGN;
> MoveRegionProcedure hri=60ccd4513bc298b83d062cb0172ccba9,
> source=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960,
> destination=wn1-opencg.5w3ff4rocu0e1dpkokmkmgo5ib.zx.internal.cloudapp.net
> ,16020,1562169131960}]
>
>
> And then, the loop starts over
>
> The ReopenTableRegionsProcedure is stuck at
> REOPEN_TABLE_REGIONS_CONFIRM_REOPENED, where it starts over again, so,
> somehow, this should be relaed also with HBASE-20752
> <https://issues.apache.org/jira/browse/HBASE-20752>
>
>
> Any idea what's going on? Is it related with the above tickets?
>
>
> Thanks
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message