hadoop-common-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1776) [hbase] Fix for sporadic compaction failures closing and moving compaction result
Date Fri, 24 Aug 2007 06:52:30 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1776:
--------------------------

    Attachment: fixes.patch

It turns out that, on a cluster, another thread of control could occasionally remove
the parent compaction working directory out from under an ongoing compaction.  Thanks to Raghu
over in HADOOP-1765 for making the key observation that helped figure out the cause of the erratic
compaction behavior.  Here's a patch that redoes how compactions are cleaned up.
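
For illustration only, here is a minimal sketch of that failure mode. It is not code from fixes.patch: it uses java.nio.file on a local disk rather than the Hadoop FileSystem API, and the class name, method names, and the "compaction.dir" layout are all hypothetical. It contrasts cleanup of a shared parent working directory with cleanup scoped to a directory owned by one store, which is one way (assumed here, not confirmed by the patch) to keep one compaction from deleting another's files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical illustration; not code from fixes.patch.
class CompactionDirSketch {

    // Recursively delete a directory tree, children before parents.
    static void deleteTree(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if store B's in-progress result file is still present
    // after store A finishes its compaction and cleans up.
    static boolean peerFileSurvives(Path root, boolean shared) {
        try {
            // Shared scheme: both stores work under root/compaction.tmp.
            // Owned scheme (assumed layout): each store has its own working dir.
            Path sharedParent = root.resolve("compaction.tmp");
            Path aDir = shared ? sharedParent.resolve("storeA")
                               : root.resolve("storeA").resolve("compaction.dir");
            Path bDir = shared ? sharedParent.resolve("storeB")
                               : root.resolve("storeB").resolve("compaction.dir");
            Files.createDirectories(aDir);
            Files.createDirectories(bDir);
            Path bResult = Files.createFile(bDir.resolve("result"));

            // Store A finishes first. In the shared scheme it removes the
            // parent; in the owned scheme it removes only its own directory.
            deleteTree(shared ? sharedParent : aDir);
            return Files.exists(bResult);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("compaction-sketch");
        System.out.println("shared parent, peer file survives: "
                + peerFileSurvives(root.resolve("old"), true));
        System.out.println("owned dir, peer file survives: "
                + peerFileSurvives(root.resolve("new"), false));
        deleteTree(root);
    }
}
```

In the shared case, store B's later close, move, or open of its result file would fail because the file has gone, which matches the symptoms reported in this issue.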

Below is the commit comment:

{code}
HADOOP-1776 Fix for sporadic compaction failures closing and moving compaction
result

M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConnectionManager.java
    Minor fix of a log message.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    (COMPACTION_DIR, WORKING_COMPACTION): Removed.
    (compactdir): Renamed compactionDir.
    Removed from the constructor our check for a compaction left undone.
    Instead, just ignore it.  When the compaction reruns, whatever was left on
    the filesystem will just be cleaned up and we'll rerun the compaction
    (the likelihood of a crash mid-compaction in exactly the area where
    the compaction was recoverable is low -- it is more robust to just redo
    the compaction from scratch).
    (compactHelper): We were deleting the HBaseRoot/compaction.tmp dir
    after a compaction completed. Usually fine, but on a cluster of
    more than one machine, if two compactions were near-concurrent, one
    machine could remove the compaction working directory while another
    was midway through its compaction.  The result was odd failures
    during compaction of the result file, during the move of the resulting
    compacted file, or subsequently when trying to open a reader on the
    resulting compacted file (see HADOOP-1765).
    (getFilesToCompact): Added.
    (processReadyCompaction): Added.  Reorganized compaction so that the
    window during which loss of data is possible is narrowed, and even
    then we log a message describing how a restore might be performed manually
    (TODO: Add a repair tool).
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    (rename): More checking that the rename was successful.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    An empty log gives HLog trouble.  Added handling.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Cleanup of debug level logging.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Minor javadoc and changed a log from info to debug.
{code}
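
As a rough sketch of the kind of checking the HStoreFile (rename) note describes, and not the patch's actual code: java.io.File.renameTo, like Hadoop's FileSystem.rename, reports failure through a boolean return value that is easy to ignore. The hypothetical helper below (class, method, and file names are assumptions made for illustration) checks the source and destination before the rename and the result after it, and throws instead of continuing silently.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical illustration; not code from fixes.patch.
class CheckedRenameSketch {

    // Rename src to dst and fail loudly instead of ignoring a 'false' return.
    static void checkedRename(File src, File dst) throws IOException {
        if (!src.exists()) {
            throw new FileNotFoundException("Source missing before rename: " + src);
        }
        if (dst.exists()) {
            throw new IOException("Destination already exists: " + dst);
        }
        if (!src.renameTo(dst)) {
            throw new IOException("Rename failed: " + src + " -> " + dst);
        }
        if (!dst.exists()) {
            throw new IOException("Rename reported success but destination is missing: " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("rename-sketch").toFile();
        File src = new File(dir, "compacted.tmp");
        File dst = new File(dir, "compacted");

        // The source was never written, so the helper should refuse.
        try {
            checkedRename(src, dst);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }

        // With a real source file the rename goes through.
        if (!src.createNewFile()) {
            throw new IOException("Could not create " + src);
        }
        checkedRename(src, dst);
        System.out.println("renamed: " + dst.exists());

        // Clean up the temporary files.
        dst.delete();
        dir.delete();
    }
}
```

Failing at the rename itself points at the step that went wrong, rather than surfacing later as a FileNotFoundException when a reader is opened on the compacted file.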


> [hbase] Fix for sporadic compaction failures closing and moving compaction result
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-1776
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1776
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.15.0
>
>         Attachments: fixes.patch
>
>
> Compactions are sporadically throwing IOExceptions on close of the compacted file, and FileNotFoundExceptions
> when moving compacted files or subsequently trying to open the compacted file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

