hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-543) A region's state is kept in several places in the master opening the possibility for race conditions
Date Sun, 21 Dec 2008 07:32:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658359#action_12658359
] 

stack commented on HBASE-543:
-----------------------------

How do I know this fixes the issues?  There are no tests nor description of how this addresses
issues seen in referenced issues.  For example, it would be helpful if you explained how you
replicated said issues so could test this patch actually addresses them.

There is a bunch of redoing of state processing but no explanation as to why?  For example:

{code}
+   * Remove a region from the region state map.
+   * 
+   * @param info
+   */
+  public void removeRegion(HRegionInfo info) {
{code}

Under what circumstance would I remove a region from state map?

The region state map itself has no explanation:

{code}
+  // Needs to be SortedMap so we can specify a comparator
+  private final SortedMap<byte[], RegionState> regionState =
+    Collections.synchronizedSortedMap(
+        new TreeMap<byte[], RegionState>(Bytes.BYTES_COMPARATOR));
{code}

The maps it would replace tried to explain what they were about.

Nor does the new RegionState map.

With above said, looks like this could be an improvement in that state is all in one place.

Should RegionState be looking for illegal states?  It doesn't seem to do any checking.  This
would be a good place to check we're doing transitions properly.

Should resetting of connection root region be done inside unsetRootRegion in below so the
two actions are tied together:

{code}
+      master.connection.setRootRegionLocation(null);
+      master.regionManager.unsetRootRegion();
{code}

Does unsetRootRegion set root region to null in regionManger?  Maybe connection and regionManager
both need an unsetRootRegion method (or both a setRootRegionLocation that takes null) so same
action in two places uses similarily named methods (This stuff preexisted your patch).

At first I thought all these things unsafe:

{code}
+      for (RegionState s: regionsToAssign) {
{code}

.. but now I see your comment that there is a lock held higher up on regionsToAssign.... good.

Enough for now.

Good stuff.

> A region's state is kept in several places in the master opening the possibility for
race conditions
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-543
>                 URL: https://issues.apache.org/jira/browse/HBASE-543
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.1.0, 0.1.1, 0.2.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.19.0
>
>         Attachments: 543.patch, 543.patch, recent-changes.patch, regionstate.txt
>
>
> A region's state exists in multiple maps in the RegionManager: unassignedRegions, pendingRegions,
regionsToClose, closingRegions, regionsToDelete, etc.
> One of these race conditions was found in HBASE-534.
> For HBase-0.1.x, we should just patch the holes we find.
> The ultimate solution (which requires a lot of changes in HMaster) should be applied
to HBase trunk.
> Proposed solution:
> Create a class that encapsulates a region's state and provide synchronized access to
the class that validates state changes.
> There should be a single structure that holds regions in these transitional states and
it should be a synchronized collection of some kind.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message