hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign
Date Thu, 18 Oct 2018 21:50:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655931#comment-16655931 ]

stack commented on HBASE-21213:
-------------------------------

In HBASE-21307....

[~Apache9] says...
bq. One simple solution is just let it go, without any fencing, but I'm afraid there will
be other problem if we still keep scheduling MRP and SCP...
[~allan163] says...
bq. Should we consider remove the reference when bypassing, like the early solutions in HBASE-21213?

So, consider going back to the original solution to this issue for branch-2.1 and branch-2.0. Keep
an eye out and see whether the final patch here does more harm than good.

> [hbck2] bypass leaves behind state in RegionStates when assign/unassign
> -----------------------------------------------------------------------
>
>                 Key: HBASE-21213
>                 URL: https://issues.apache.org/jira/browse/HBASE-21213
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2, hbck2
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.1.1
>
>         Attachments: HBASE-21213.branch-2.1.001.patch, HBASE-21213.branch-2.1.002.patch,
HBASE-21213.branch-2.1.003.patch, HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch,
HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.007.patch,
HBASE-21213.branch-2.1.008.patch, HBASE-21213.branch-2.1.009.patch, HBASE-21213.branch-2.1.010.patch,
HBASE-21213.branch-2.1.011.patch
>
>
> This is a follow-on from HBASE-21083, which added the 'bypass' functionality. On bypass,
there is more state to be cleared if we are to allow new Procedures to be scheduled.
> For example, here is a bypass:
> {code}
> 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: pid=100449,
state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, bypass=LOG-REDACTED UnassignProcedure
table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664
bypassed, returning null to finish it
> 2018-09-20 05:45:44,022 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished
pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16,
server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
> {code}
> ... but then when I try to assign the bypassed region later, I get this:
> {code}
> 2018-09-20 05:46:31,435 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure:
There is already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16
owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace,
region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664
pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure table=hbase:namespace,
region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, location=ve1233.halxg.cloudera.com,22101,1537397961664
> 2018-09-20 05:46:31,510 INFO org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled
back pid=100450, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException
via AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: There is
already another procedure running on this region this=pid=100450, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16
owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace,
region=37cc206fe9c4bc1c0a46a34c5f523d16, server=ve1233.halxg.cloudera.com,22101,1537397961664;
AssignProcedure table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 exec-time=473msec
> {code}
> ... which is a long-winded way of saying the Unassign Procedure still exists in
RegionStateNodes.
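
The failure mode in the logs above can be modelled with a toy sketch. This is not HBase code: `BypassSketch`, `Proc`, `RegionNode`, `schedule`, and `bypass` are hypothetical names invented for illustration, and the real AMv2 classes are considerably more involved. The sketch assumes only what the logs suggest: a per-region node remembers its owning procedure, and a new procedure is refused while that reference is still set. The `clearReference` flag stands in for the "remove the reference when bypassing" idea quoted in the comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these types are the real HBase classes. */
class BypassSketch {

    /** Hypothetical stand-in for a procedure: an id plus a finished flag. */
    static final class Proc {
        final long pid;
        boolean finished;

        Proc(long pid) {
            this.pid = pid;
        }
    }

    /** Hypothetical stand-in for a region state node that remembers its owner. */
    static final class RegionNode {
        Proc owner; // null when no procedure is recorded against the region
    }

    private final Map<String, RegionNode> nodes = new HashMap<>();

    private RegionNode node(String region) {
        return nodes.computeIfAbsent(region, r -> new RegionNode());
    }

    /** Accepts the procedure unless a different one is still recorded as owner. */
    boolean schedule(String region, Proc p) {
        RegionNode n = node(region);
        if (n.owner != null && n.owner != p) {
            return false; // "There is already another procedure running on this region"
        }
        n.owner = p;
        return true;
    }

    /** Marks the procedure finished; drops the owner reference only if asked to. */
    void bypass(String region, Proc p, boolean clearReference) {
        p.finished = true;
        RegionNode n = node(region);
        if (clearReference && n.owner == p) {
            n.owner = null;
        }
    }

    public static void main(String[] args) {
        String region = "37cc206fe9c4bc1c0a46a34c5f523d16";

        // Bypass that leaves the owner reference behind.
        BypassSketch leaky = new BypassSketch();
        Proc unassign = new Proc(100449L);
        leaky.schedule(region, unassign);
        leaky.bypass(region, unassign, false);
        System.out.println("assign accepted after plain bypass: "
            + leaky.schedule(region, new Proc(100450L)));

        // Bypass that also clears the owner reference.
        BypassSketch cleaned = new BypassSketch();
        Proc unassign2 = new Proc(100449L);
        cleaned.schedule(region, unassign2);
        cleaned.bypass(region, unassign2, true);
        System.out.println("assign accepted after bypass with cleanup: "
            + cleaned.schedule(region, new Proc(100450L)));
    }
}
```

In this model the first assign is refused even though the bypassed procedure is marked finished, because the node still points at it; the second is accepted because the bypass cleared the reference.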



