hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8446) Allow parallel snapshot of different tables
Date Tue, 30 Apr 2013 10:48:16 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Bertozzi updated HBASE-8446:
-----------------------------------

    Attachment: HBASE-8446-v3.patch

{quote}Is this the correct place to remove handler references? (in #getTakeSnapshotHandler)?
Removing from the list seems like an unexpected side-effect for a getter. I'd imagine this
would be at the end of the TableSnasphotHandler#process call or in #completeSnapshot. This
introduces two problems – a concurrency problem (what if two getTakeSnapshotHandlers are called)
and a resource leak problem (what if over time we create many snapshots – these handlers
never get gc'ed since they continue to live in the table..){quote}
I changed and renamed a few things; the logic now looks like this:
 * the in-progress handlers map is always accessed/changed under synchronized
 ** isSnapshotDone(), isRestoreDone(): look at the map for a pending handler and remove
it if finished
 ** isTakingSnapshot(), isRestoringTable(): look at the map in a read-only way
 ** snapshotTable(), restoreSnapshot(), cloneSnapshot(): add a new handler to the map

We can't remove the handlers until HMaster.isSnapshotDone()/HMaster.isRestoreDone() is called,
since we want to raise an exception if the snapshot/restore has failed. But if no one calls
is*Done(), the handlers stay pending forever... so now each snapshot/restore/clone operation
invokes cleanupSentinels(), which removes the completed ones after a specified timeout.
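
A minimal sketch of the scheme described above, not the code in the patch. The names SentinelRegistry, Sentinel, start(), isRunning(), isDone() and pendingCount() are hypothetical, and keying the map by table name is an assumption. Time is passed in explicitly so the timeout behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the in-progress handlers map; not HBase code. */
class SentinelRegistry {
    /** Hypothetical stand-in for a snapshot/restore handler. */
    static final class Sentinel {
        private volatile boolean finished;
        private volatile RuntimeException failure; // null unless the operation failed
        private volatile long completionTime;      // millis, meaningful once finished

        /** Called by the worker when the operation ends (failureOrNull == null on success). */
        void complete(RuntimeException failureOrNull, long nowMillis) {
            failure = failureOrNull;
            completionTime = nowMillis;
            finished = true; // written last so readers see the other fields
        }
    }

    // Assumption: one pending handler per table name.
    private final Map<String, Sentinel> inProgress = new HashMap<>();
    private final long cleanupTimeoutMillis;

    SentinelRegistry(long cleanupTimeoutMillis) {
        this.cleanupTimeoutMillis = cleanupTimeoutMillis;
    }

    /** Analogue of snapshotTable()/restoreSnapshot()/cloneSnapshot(): adds a handler. */
    synchronized Sentinel start(String tableName, long nowMillis) {
        cleanupSentinels(nowMillis); // every new operation sweeps stale entries
        Sentinel existing = inProgress.get(tableName);
        if (existing != null && !existing.finished) {
            throw new IllegalStateException("operation already running on " + tableName);
        }
        Sentinel sentinel = new Sentinel();
        inProgress.put(tableName, sentinel);
        return sentinel;
    }

    /** Analogue of isTakingSnapshot()/isRestoringTable(): read-only lookup. */
    synchronized boolean isRunning(String tableName) {
        Sentinel s = inProgress.get(tableName);
        return s != null && !s.finished;
    }

    /** Analogue of is*Done(): removes a finished handler and surfaces its failure. */
    synchronized boolean isDone(String tableName) {
        Sentinel s = inProgress.get(tableName);
        if (s == null) {
            throw new IllegalArgumentException("no operation known for " + tableName);
        }
        if (!s.finished) {
            return false;
        }
        inProgress.remove(tableName);
        if (s.failure != null) {
            throw s.failure;
        }
        return true;
    }

    /** Drops handlers that finished more than the timeout ago and were never polled. */
    private void cleanupSentinels(long nowMillis) {
        inProgress.values().removeIf(
            s -> s.finished && nowMillis - s.completionTime > cleanupTimeoutMillis);
    }

    synchronized int pendingCount() {
        return inProgress.size();
    }
}
```

In this sketch a failed handler stays in the map until either isDone() reports the failure to a caller or the timeout sweep in start() discards it, which mirrors the trade-off described above.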

{quote}SnapshotManager lines 455. (delete working dir on failed snapshot) – will this affect
other concurrent table snapshots? Are they isolated? (Please add test.). Could one fail
on verification (if another is getting deleted?){quote}
Here we are removing the snapshot working dir... so only "snapshotName" is affected.
As for the new test, my guess is that TestFlushSnapshotFromClient.testConcurrentSnapshottingAttempts()
already covers the failure scenario with concurrent snapshots. It executes N snapshots
on two different tables; some will fail, and at the end we verify that there is at least one snapshot
per table.
                
> Allow parallel snapshot of different tables
> -------------------------------------------
>
>                 Key: HBASE-8446
>                 URL: https://issues.apache.org/jira/browse/HBASE-8446
>             Project: HBase
>          Issue Type: Improvement
>          Components: snapshots
>    Affects Versions: 0.95.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 0.95.2
>
>         Attachments: HBASE-8446-94.patch, HBASE-8446-v0.patch, HBASE-8446-v1.patch, HBASE-8446-v2.patch,
HBASE-8446-v3.patch
>
>
> Currently only one snapshot at a time is allowed.
> As with restore, we should allow taking snapshots of different tables in parallel.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
