hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8446) Allow parallel snapshot of different tables
Date Tue, 30 Apr 2013 10:48:16 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Matteo Bertozzi updated HBASE-8446:

    Attachment: HBASE-8446-v3.patch

{quote}Is this the correct place to remove handler references? (in #getTakeSnapshotHandler)?
Removing from the list seems like an unexpected side-effect for a getter. I'd imagine this
would be at the end of the TableSnasphotHandler#process call or in #completeSnapshot. This
introduces two problems – a concurrency problem (what if two getTakeSnapshotHandlers called)
and a resource leak problem (what if over time we create many snapshots – these handlers
never get gc'ed since they continue to live in the table..){quote}
I changed and renamed a few things. The logic now looks like this:
 * the in-progress handlers map is always accessed and modified under synchronized
 ** isSnapshotDone(), isRestoreDone(): look in the map for a pending handler and remove
it if it has finished
 ** isTakingSnapshot(), isRestoringTable(): look at the map in a read-only way
 ** snapshotTable(), restoreSnapshot(), cloneSnapshot(): add a new handler to the map
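
A rough sketch of the access pattern listed above, not the code in the patch. The names InProgressHandlers, Sentinel and FakeSentinel are made up for illustration, and keying the map by table name is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a snapshot/restore handler; not the HBase class. */
interface Sentinel {
    boolean isFinished();
    /** Returns the failure, or null if the operation has not failed. */
    Exception getFailure();
}

/** Trivial Sentinel used only to exercise the sketch. */
class FakeSentinel implements Sentinel {
    volatile boolean finished;
    volatile Exception failure;
    public boolean isFinished() { return finished; }
    public Exception getFailure() { return failure; }
}

/** Hypothetical registry; assumes at most one in-progress handler per table. */
class InProgressHandlers {
    private final Map<String, Sentinel> handlers = new HashMap<>();

    /** Like snapshotTable()/restoreSnapshot()/cloneSnapshot(): adds a handler. */
    synchronized void start(String table, Sentinel handler) {
        Sentinel running = handlers.get(table);
        if (running != null && !running.isFinished()) {
            throw new IllegalStateException("operation already running on " + table);
        }
        handlers.put(table, handler);
    }

    /** Like isTakingSnapshot()/isRestoringTable(): read-only, never modifies the map. */
    synchronized boolean isInProgress(String table) {
        Sentinel s = handlers.get(table);
        return s != null && !s.isFinished();
    }

    /** Like isSnapshotDone()/isRestoreDone(): removes the handler once it has finished. */
    synchronized boolean isDone(String table) {
        Sentinel s = handlers.get(table);
        if (s == null) {
            throw new IllegalArgumentException("no operation known for " + table);
        }
        if (!s.isFinished()) {
            return false;
        }
        handlers.remove(table);
        if (s.getFailure() != null) {
            // Report the failure to the caller that asked for the result.
            throw new IllegalStateException("operation failed on " + table, s.getFailure());
        }
        return true;
    }

    synchronized int size() {
        return handlers.size();
    }
}
```

Because every method is synchronized on the same object, two concurrent isDone() calls cannot both remove the same handler, and the getter-style checks have no side effects.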

We can't remove the handlers until HMaster.isSnapshotDone()/HMaster.isRestoreDone() is called,
since we want to raise an exception if the snapshot/restore has failed. But if no one calls
is*Done(), the handlers stay pending forever. So now, on each snapshot/restore/clone operation,
cleanupSentinels() is invoked and removes the completed handlers after a specified timeout.
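
The timeout-based cleanup could be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: TimedSentinels, Tracked and markCompleted() are hypothetical names, and the clock is passed in as a parameter only to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker that forgets completed operations after a timeout. */
class TimedSentinels {
    /** One tracked operation; completedAt < 0 means it is still running. */
    static final class Tracked {
        long completedAt = -1;
    }

    private final Map<String, Tracked> handlers = new HashMap<>();
    private final long timeoutMillis;

    TimedSentinels(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Every new operation first sweeps out stale completed handlers. */
    synchronized void start(String table, long nowMillis) {
        cleanupSentinels(nowMillis);
        handlers.put(table, new Tracked());
    }

    synchronized void markCompleted(String table, long nowMillis) {
        Tracked t = handlers.get(table);
        if (t != null) {
            t.completedAt = nowMillis;
        }
    }

    /** Removes handlers that completed more than timeoutMillis ago. */
    synchronized void cleanupSentinels(long nowMillis) {
        handlers.values().removeIf(
            t -> t.completedAt >= 0 && nowMillis - t.completedAt > timeoutMillis);
    }

    synchronized boolean isTracked(String table) {
        return handlers.containsKey(table);
    }
}
```

The timeout leaves a window in which a client can still ask for the result (and see a failure) before the handler is dropped, while bounding how long unclaimed handlers stay in memory.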

{quote}SnapshotManager lines 455. (delete working dir on failed snapshot) – will this affect
other concurrent table snapshots? Are they isolated? (Please add test.). Could one fail
on verification (if another is getting deleted?){quote}
Here we are removing the snapshot working dir, so only "snapshotName" is affected.
For the new test, my guess is that TestFlushSnapshotFromClient.testConcurrentSnapshottingAttempts()
already covers the failure scenario with concurrent snapshots. It executes N snapshots
on two different tables; some will fail, and at the end we verify that there's at least one snapshot
per table.
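
To illustrate the shape of that check (this is a toy model, not the HBase test): the class below is hypothetical, rejects a second snapshot while one is in flight on the same table, and counts the successes per table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model: one in-flight snapshot per table, different tables in parallel. */
class ConcurrentSnapshotSketch {
    private final Set<String> busyTables = ConcurrentHashMap.newKeySet();
    private final Map<String, AtomicInteger> completed = new ConcurrentHashMap<>();

    /** Returns false if a snapshot of the same table is already running. */
    boolean trySnapshot(String table) {
        if (!busyTables.add(table)) {
            return false; // this attempt "fails", like a rejected concurrent request
        }
        try {
            completed.computeIfAbsent(table, t -> new AtomicInteger()).incrementAndGet();
            return true;
        } finally {
            busyTables.remove(table);
        }
    }

    int completedFor(String table) {
        AtomicInteger n = completed.get(table);
        return n == null ? 0 : n.get();
    }

    /** Fires attemptsPerTable snapshot attempts at each table from a small thread pool. */
    static ConcurrentSnapshotSketch runAttempts(int attemptsPerTable, String... tables) {
        ConcurrentSnapshotSketch svc = new ConcurrentSnapshotSketch();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Boolean>> results = new ArrayList<>();
        for (int i = 0; i < attemptsPerTable; i++) {
            for (String table : tables) {
                Callable<Boolean> attempt = () -> svc.trySnapshot(table);
                results.add(pool.submit(attempt));
            }
        }
        for (Future<Boolean> f : results) {
            try {
                f.get(); // wait for every attempt, successful or not
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        pool.shutdown();
        return svc;
    }
}
```

How many attempts are rejected depends on thread timing, so the only stable assertion is the one described above: at least one snapshot per table.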
> Allow parallel snapshot of different tables
> -------------------------------------------
>                 Key: HBASE-8446
>                 URL: https://issues.apache.org/jira/browse/HBASE-8446
>             Project: HBase
>          Issue Type: Improvement
>          Components: snapshots
>    Affects Versions: 0.95.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 0.95.2
>         Attachments: HBASE-8446-94.patch, HBASE-8446-v0.patch, HBASE-8446-v1.patch, HBASE-8446-v2.patch,
> Currently only one snapshot at a time is allowed.
> As with restore, we should allow taking snapshots of different tables in parallel.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
