hbase-issues mailing list archives

From "Jianwei Cui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15433) SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
Date Mon, 14 Mar 2016 10:13:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15193038#comment-15193038 ]

Jianwei Cui commented on HBASE-15433:
-------------------------------------

{quote}
When QEE is thrown we will still end up updating the region quota, which is not really required;
maybe we can avoid that.
{quote}
Yes, we should catch QEE first and not update the quota information in that situation, as
you suggested above.
{quote}
Also suggest renaming currentRegionCount to tableRegionCount and updatedRegionCount to snapshotRegionCount
for better understanding. Please add more comments explaining why we do it this way.
{quote}
Good suggestions, will update the patch.

{quote}
If this throws an exception then there will be another issue, because by then the snapshot has
been successfully restored, but in the catch clause we are updating the table region count in
the namespace quota.
{quote}
Good find. Here, {{checkAndUpdateNamespaceRegionQuota}} should succeed, since it only
reduces the region count for the table, right? However, if {{checkAndUpdateNamespaceRegionQuota}}
does throw an exception, the cause must be something unexpected, and calling
{{checkAndUpdateNamespaceRegionQuota}} again in the catch clause may fail as well. Could we log
an error message in the QEE catch clause and rethrow the exception directly? The code here could
then be updated as:
{code}
      int tableRegionCount = -1;
      try {
        // Table already exists. Check and update the region quota for this table's namespace.
        // The table is disabled, so its region count won't change during restoreSnapshot.
        tableRegionCount = getRegionCountOfTable(tableName);
        int snapshotRegionCount = manifest.getRegionManifestsMap().size();

        // Update the region count before restoreSnapshot if snapshotRegionCount is larger.
        // If we updated the region count to a smaller value before restoreSnapshot and
        // restoreSnapshot then failed, we might be unable to reset the region count to its
        // original value, because other tables in the same namespace may have consumed the
        // region count quota in the meantime (e.g. by a region split or a table create).
        if (tableRegionCount > 0 && tableRegionCount < snapshotRegionCount) {
          checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
        }

        restoreSnapshot(snapshot, snapshotTableDesc);

        // Update the region count after restoreSnapshot succeeds if snapshotRegionCount is
        // smaller. This step should not fail because it reduces the table's region count.
        if (tableRegionCount > 0 && tableRegionCount > snapshotRegionCount) {
          checkAndUpdateNamespaceRegionQuota(snapshotRegionCount, tableName);
        }
      } catch (QuotaExceededException e) {
        LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
          + " as table " + tableName.getNameAsString(), e);
        // If QEE is thrown before restoreSnapshot, the quota information has not been updated,
        // so we rethrow the exception directly. If QEE is thrown after restoreSnapshot, the
        // cause must be something unexpected, and we also rethrow the exception directly.
        throw e;
      } catch (IOException e) {
        if (tableRegionCount > 0) {
          // Reset the region count for the table.
          checkAndUpdateNamespaceRegionQuota(tableRegionCount, tableName);
        }
        LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
            + " as table " + tableName.getNameAsString(), e);
        throw e;
      }
{code}
What's your opinion about this issue? [~ashish singhi]
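
To make the ordering in the proposed change easier to check, here is a minimal standalone sketch of the same idea: grow the count before the restore, shrink it after, and reset it on failure. It is not HBase code. {{RestoreQuotaSketch}}, {{NamespaceQuota}}, {{checkAndUpdate}} and {{restore}} are hypothetical names, the quota is a plain in-memory map, and an unchecked {{IllegalStateException}} stands in for QEE.

```java
import java.util.HashMap;
import java.util.Map;

public class RestoreQuotaSketch {

    /** Hypothetical in-memory stand-in for a namespace region quota (not the HBase API). */
    static class NamespaceQuota {
        private final int maxRegions;
        private final Map<String, Integer> regionsPerTable = new HashMap<>();

        NamespaceQuota(int maxRegions) {
            this.maxRegions = maxRegions;
        }

        int total() {
            int sum = 0;
            for (int count : regionsPerTable.values()) {
                sum += count;
            }
            return sum;
        }

        int regionsOf(String table) {
            return regionsPerTable.getOrDefault(table, 0);
        }

        /** Sets the table's region count; rejects the update if the namespace limit is exceeded. */
        void checkAndUpdate(String table, int newCount) {
            int others = total() - regionsOf(table);
            if (others + newCount > maxRegions) {
                // Stand-in for QuotaExceededException.
                throw new IllegalStateException("region quota exceeded");
            }
            regionsPerTable.put(table, newCount);
        }
    }

    /** Grow the count before the restore, shrink it after, reset it if the restore fails. */
    static void restore(NamespaceQuota quota, String table, int snapshotRegions,
                        boolean restoreFails) {
        int tableRegions = quota.regionsOf(table);
        if (tableRegions < snapshotRegions) {
            // May throw; nothing has been changed yet, so there is nothing to undo.
            quota.checkAndUpdate(table, snapshotRegions);
        }
        try {
            if (restoreFails) {
                throw new RuntimeException("simulated restore failure");
            }
        } catch (RuntimeException e) {
            // The reset never raises the count above its original value, so it fits the quota.
            quota.checkAndUpdate(table, tableRegions);
            throw e;
        }
        if (tableRegions > snapshotRegions) {
            // Only lowers the count, so it cannot exceed the quota.
            quota.checkAndUpdate(table, snapshotRegions);
        }
    }

    public static void main(String[] args) {
        NamespaceQuota quota = new NamespaceQuota(10);
        quota.checkAndUpdate("t1", 4);
        quota.checkAndUpdate("t2", 5);
        restore(quota, "t1", 2, false);
        System.out.println("t1 regions after restore: " + quota.regionsOf("t1"));
    }
}
```

In this sketch the tracked count only moves away from the table's real region count while the restore is in flight, and it is put back if the restore fails.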

> SnapshotManager#restoreSnapshot not update table and region count quota correctly when encountering exception
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15433
>                 URL: https://issues.apache.org/jira/browse/HBASE-15433
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>         Attachments: HBASE-15433-trunk-v1.patch, HBASE-15433-trunk-v2.patch, HBASE-15433-trunk.patch
>
>
> In SnapshotManager#restoreSnapshot, the table and region quota will be checked and updated as:
> {code}
>       try {
>         // Table already exist. Check and update the region quota for this table namespace
>         checkAndUpdateNamespaceRegionQuota(manifest, tableName);
>         restoreSnapshot(snapshot, snapshotTableDesc);
>       } catch (IOException e) {
>         this.master.getMasterQuotaManager().removeTableFromNamespaceQuota(tableName);
>         LOG.error("Exception occurred while restoring the snapshot " + snapshot.getName()
>             + " as table " + tableName.getNameAsString(), e);
>         throw e;
>       }
> {code}
> The 'checkAndUpdateNamespaceRegionQuota' call will fail if the regions in the snapshot would exceed the region count quota; the table is then removed from the namespace quota in the 'catch' block. This decreases the tracked table count and region count, so a subsequent table creation or region split will succeed even if the actual quota is exceeded.
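
The failure mode described above can be reproduced in a small standalone sketch. This is not HBase code: {{QuotaLeakSketch}}, {{checkAndUpdate}}, {{buggyRestore}} and {{trackedTotal}} are hypothetical names, the quota is a plain in-memory map, and an unchecked {{IllegalStateException}} stands in for the quota exception.

```java
import java.util.HashMap;
import java.util.Map;

public class QuotaLeakSketch {
    private final int maxRegions;
    // Region counts the quota currently tracks, per table (hypothetical model).
    private final Map<String, Integer> tracked = new HashMap<>();

    QuotaLeakSketch(int maxRegions) {
        this.maxRegions = maxRegions;
    }

    int trackedTotal() {
        int sum = 0;
        for (int count : tracked.values()) {
            sum += count;
        }
        return sum;
    }

    /** Sets the table's region count; rejects the update if the limit would be exceeded. */
    void checkAndUpdate(String table, int newCount) {
        int others = trackedTotal() - tracked.getOrDefault(table, 0);
        if (others + newCount > maxRegions) {
            throw new IllegalStateException("region quota exceeded");
        }
        tracked.put(table, newCount);
    }

    /** Mirrors the shape of the original catch block: any failure drops the table from the quota. */
    void buggyRestore(String table, int snapshotRegions) {
        try {
            checkAndUpdate(table, snapshotRegions);
        } catch (IllegalStateException e) {
            // The table still exists with its old regions, but is no longer counted.
            tracked.remove(table);
            throw e;
        }
    }

    public static void main(String[] args) {
        QuotaLeakSketch quota = new QuotaLeakSketch(10);
        quota.checkAndUpdate("t1", 6);
        try {
            quota.buggyRestore("t1", 12); // snapshot has too many regions
        } catch (IllegalStateException e) {
            System.out.println("tracked regions after failed restore: " + quota.trackedTotal());
        }
        // Accepted although t1 still really holds 6 regions.
        quota.checkAndUpdate("t2", 10);
        System.out.println("tracked regions after creating t2: " + quota.trackedTotal());
    }
}
```

After the failed restore the sketch tracks no regions for the existing table, so the later update for a second table passes the check even though the real total is above the limit.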



