Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 22 May 2013 01:04:21 +0000 (UTC)
From: "Matteo Bertozzi (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12641731.1365544391271.1426.1369184661124@arcas>
In-Reply-To: <JIRA.12641731.1365544391271@arcas>
References: <JIRA.12641731.1365544391271@arcas>
Subject: [jira] [Commented] (HBASE-8310) HBase snapshot timeout default
 values and TableLockManger timeout
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663645#comment-13663645 ] 

Matteo Bertozzi commented on HBASE-8310:
----------------------------------------

-1 on the patch

{quote}We are more relying on the table lock at the moment to prevent concurrent snapshot of the same table.{quote}
No. The table lock is not involved in the "Reject snapshot" exception.
The synchronized block ensures that only a single snapshot can be enqueued at a time, and the snapshotHandler is checked before performing a snapshotTable() operation. isSnapshotInProgress() and isRestoreInProgress() are the ones that throw the "rejecting snapshot" exception, not the table lock.

Take a look at SnapshotManager.prepareToTakeSnapshot() to see where the exception is thrown.
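
To make the point concrete, here is a minimal sketch of the kind of guard described above. It is not the actual SnapshotManager code: the class SnapshotGuardSketch, the nested Handler and SnapshotRejectedException types, and the method bodies are hypothetical simplifications, and the restore check is left out. It only shows that the rejection comes from a synchronized check of the current handler, with no table lock involved.

```java
// Hypothetical, simplified stand-in for the guard described above.
// None of these types are the real HBase classes.
class SnapshotGuardSketch {

    /** Stand-in for a running snapshot handler. */
    static final class Handler {
        private volatile boolean finished = false;

        void finish() { finished = true; }

        boolean isFinished() { return finished; }
    }

    /** Stand-in for the "rejecting snapshot" exception (unchecked to keep the sketch short). */
    static final class SnapshotRejectedException extends RuntimeException {
        SnapshotRejectedException(String message) { super(message); }
    }

    // Only one snapshot handler can be enqueued at a time.
    private Handler snapshotHandler;

    /** True while a previously enqueued snapshot has not finished yet. */
    synchronized boolean isSnapshotInProgress() {
        return snapshotHandler != null && !snapshotHandler.isFinished();
    }

    /**
     * Checks the current handler before enqueueing a new one.
     * The rejection happens here, before any table lock would be requested.
     */
    synchronized Handler takeSnapshot(String tableName) {
        if (isSnapshotInProgress()) {
            throw new SnapshotRejectedException(
                "Rejecting snapshot of " + tableName + ": another snapshot is already running");
        }
        snapshotHandler = new Handler();
        return snapshotHandler;
    }
}
```

In this sketch a second takeSnapshot() call fails immediately while the first handler is still running, and succeeds again once that handler has finished.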
                
> HBase snapshot timeout default values and TableLockManger timeout
> -----------------------------------------------------------------
>
>                 Key: HBASE-8310
>                 URL: https://issues.apache.org/jira/browse/HBASE-8310
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 0.95.0
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Minor
>             Fix For: 0.98.0, 0.95.2, 0.94.9
>
>         Attachments: trunk.patch
>
>
> There are a few timeout values and defaults being used by HBase snapshot:
> DEFAULT_MAX_WAIT_TIME (60000 ms, 1 min) for client response
> TIMEOUT_MILLIS_DEFAULT (60000 ms, 1 min) for Procedure timeout
> SNAPSHOT_TIMEOUT_MILLIS_DEFAULT (60000 ms, 1 min) for region server subprocedure
> There are also other timeouts involved, for example:
> DEFAULT_TABLE_WRITE_LOCK_TIMEOUT_MS (10 mins) for TakeSnapshotHandler#prepare()
> We could have this case:
> The user issues a sync snapshot request, waits for 1 min, and gets an exception.
> In the meantime the snapshot handler is blocked on the table lock, and the snapshot may continue to finish after 10 mins.
> But the user will probably re-issue the snapshot request during the 10 mins.
> This is confusing and messy when it happens.
> To be more reasonable, we should either increase the DEFAULT_MAX_WAIT_TIME or decrease the table lock waiting time.
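
The mismatch in the description above can be illustrated with a small sketch. The two constants carry the names and values quoted in the issue; the class SnapshotTimeoutSketch and the method orphanWindowMillis are hypothetical and exist only to show the arithmetic.

```java
// Hypothetical helper; only the constant names and values come from the issue description.
class SnapshotTimeoutSketch {
    // Client-side wait for a synchronous snapshot request (1 min).
    static final long DEFAULT_MAX_WAIT_TIME = 60_000L;

    // Time the snapshot handler may wait on the table write lock (10 mins).
    static final long DEFAULT_TABLE_WRITE_LOCK_TIMEOUT_MS = 10 * 60_000L;

    /**
     * Length of the window in which the client has already given up
     * while the handler may still be blocked on the table lock.
     */
    static long orphanWindowMillis(long clientWaitMs, long lockTimeoutMs) {
        return Math.max(0L, lockTimeoutMs - clientWaitMs);
    }
}
```

With the defaults quoted above, the window is 9 minutes. Raising DEFAULT_MAX_WAIT_TIME or lowering the lock timeout until the two values match brings it to zero.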
