hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-2231) Compaction events should be written to HLog
Date Wed, 24 Apr 2013 02:04:16 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-2231:

    Attachment: hbase-2231_v5.patch

I have rebased Stack's v4 patch, and made some changes. 

From my testing, IO Fencing for the WAL appears to work (tested with HBASE-7878, not
HBASE-8389). [~stack], do you remember how you tested it and concluded that it does not work?
This is pretty important, so I can dedicate some more time to testing it.

Some of the changes from patch v4 to v5 are: 
 - Renamed Compaction -> CompactionDescriptor
 - Compaction output can be a list of files, not a single file. 
 - Added compaction WAL edit replay. We now recognize and replay the compaction edit on region
open. This is needed because if we fail after we sync() the WAL, we might still try to delete
the files.
 - Added 2 more tests for the above condition.
 - HLog does not know about the compaction WAL edit.
 - The compaction WAL edit goes through the normal append code path.

I've tried to capture the Fencing / Idempotency semantics in the following javadoc excerpt:

Compaction event should be idempotent, since there is no IO Fencing for
    the region directory in hdfs. A region server might still try to complete the
    compaction after it lost the region. That is why the following events are carefully
    ordered for a compaction:
     1. Compaction writes new files under region/.tmp directory (compaction output)
     2. Compaction atomically moves the temporary file under region directory
     3. Compaction appends a WAL edit containing the compaction input and output files.
     Forces sync on WAL.
     4. Compaction deletes the input files from the region directory.
    Failure conditions are handled like this:
     - If RS fails before 2, compaction won't complete. Even if RS lives on and finishes
     the compaction later, it will only write the new data file to the region directory.
     Since we already have this data, this will be idempotent but we will have a redundant
     copy of the data.
     - If RS fails between 2 and 3, the region will have a redundant copy of the data. The
     RS that failed won't be able to finish sync() for WAL because of lease recovery in WAL.
     - If RS fails after 3, the region server that opens the region will pick up the
     compaction marker from the WAL and replay it by removing the compaction input files.
     Failed RS can also attempt to delete those files, but the operation will be idempotent.
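
The ordering and failure handling described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the code in the patch: the class names (CompactionOrderingSketch, Region, CompactionMarker), the failAfterStep parameter, and the use of plain sets and lists in place of HDFS directories and the WAL are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CompactionOrderingSketch {

    /** Hypothetical stand-in for the compaction WAL edit: input and output file names. */
    static final class CompactionMarker {
        final List<String> inputs;
        final List<String> outputs;

        CompactionMarker(List<String> inputs, List<String> outputs) {
            this.inputs = new ArrayList<>(inputs);
            this.outputs = new ArrayList<>(outputs);
        }
    }

    /** In-memory stand-in for one region: store files, a .tmp directory, and a WAL. */
    static final class Region {
        final Set<String> storeFiles = new LinkedHashSet<>();
        final Set<String> tmpFiles = new LinkedHashSet<>();
        final List<CompactionMarker> wal = new ArrayList<>();

        /** Runs steps 1..4 in order; returns early after failAfterStep to simulate a crash. */
        void compact(List<String> inputs, String output, int failAfterStep) {
            tmpFiles.add(output);                 // 1. write the output under .tmp
            if (failAfterStep == 1) return;
            tmpFiles.remove(output);              // 2. move the output into the region
            storeFiles.add(output);
            if (failAfterStep == 2) return;
            wal.add(new CompactionMarker(inputs, List.of(output))); // 3. append marker
            if (failAfterStep == 3) return;
            storeFiles.removeAll(inputs);         // 4. delete the input files
        }

        /** On region open: replay each marker by removing its input files. */
        void replayCompactionMarkers() {
            for (CompactionMarker marker : wal) {
                // Removing a file that is already gone is a no-op, so replay is idempotent.
                storeFiles.removeAll(marker.inputs);
            }
        }
    }

    public static void main(String[] args) {
        Region region = new Region();
        region.storeFiles.addAll(List.of("a", "b"));
        region.compact(List.of("a", "b"), "c", 3);   // crash after the WAL append
        System.out.println("after crash:  " + region.storeFiles);
        region.replayCompactionMarkers();
        region.replayCompactionMarkers();            // a second replay changes nothing
        System.out.println("after replay: " + region.storeFiles);
    }
}
```

Passing 4 as failAfterStep runs the compaction to completion. A crash after step 2 leaves the inputs and the output side by side with no marker in the WAL, which corresponds to the redundant-copy case above.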

> Compaction events should be written to HLog
> -------------------------------------------
>                 Key: HBASE-2231
>                 URL: https://issues.apache.org/jira/browse/HBASE-2231
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>              Labels: moved_from_0_20_5
>             Fix For: 0.95.1
>         Attachments: 2231-testcase-0.94.txt, 2231-testcase_v2.txt, 2231-testcase_v3.txt,
> 2231v2.txt, 2231v3.txt, 2231v4.txt, hbase-2231-testcase.txt, hbase-2231.txt, hbase-2231_v5.patch
> The sequence for a compaction should look like this:
> # Compact region to "new" files
> # Write a "Compacted Region" entry to the HLog
> # Delete "old" files
> This deals with a case where the RS has paused between step 1 and 2 and the regions have
> since been reassigned.
