hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
Date Fri, 24 Jun 2011 21:31:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054688#comment-13054688 ]

Suresh Srinivas commented on HDFS-2052:
---------------------------------------

FSDirectory#delete()
{noformat}
    incrDeletedFileCount(filesRemoved);
    // Blocks will be deleted later by the caller of this method
    getFSNamesystem().removePathAndBlocks(src, null);
--> fsImage.getEditLog().logDelete(src, now);
    return true;
{noformat}

The code above only adds the delete operation to the edit log buffer.

The getEditLog().logSync() call in FSNamesystem#deleteInternal() is what waits for the transaction
to be synced to the edit log.

So in this case two edit log operations are deposited into the edit log buffer. One call to logSync()
in startFileInternal() will ensure that both of them are logged.

So I am not sure what you mean when you say it would drop audit logging?
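
To illustrate the buffering behaviour described above, here is a minimal sketch. It assumes a
simplified, hypothetical BufferedEditLogSketch class (not the real HDFS FSEditLog); the method
names logEdit() and logSync() and the operation strings are illustrative only. It shows two
buffered operations being made durable by a single sync call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of a buffered edit log; not the HDFS FSEditLog class. */
class BufferedEditLogSketch {
    private final List<String> buffer = new ArrayList<>(); // edits logged but not yet durable
    private final List<String> synced = new ArrayList<>(); // stands in for edits on disk
    private int syncCalls = 0;

    /** Plays the role of logDelete() and similar calls: only appends to the in-memory buffer. */
    void logEdit(String op) {
        buffer.add(op);
    }

    /** Plays the role of logSync(): flushes every buffered edit in one pass. */
    void logSync() {
        syncCalls++;
        synced.addAll(buffer);
        buffer.clear();
    }

    List<String> syncedEdits() { return synced; }
    int pendingCount() { return buffer.size(); }
    int syncCallCount() { return syncCalls; }

    public static void main(String[] args) {
        BufferedEditLogSketch log = new BufferedEditLogSketch();
        log.logEdit("delete /foo"); // deposited by the delete path (illustrative string)
        log.logEdit("create /foo"); // deposited by the create path (illustrative string)
        log.logSync();              // one sync by the outer caller covers both edits
        System.out.println(log.syncedEdits() + " synced in " + log.syncCallCount() + " call(s)");
    }
}
```

In this model nothing is lost by skipping an inner sync, as long as some later sync runs before
the result is returned to the client.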


> FSNamesystem should not sync the log with the write lock held
> -------------------------------------------------------------
>
>                 Key: HDFS-2052
>                 URL: https://issues.apache.org/jira/browse/HDFS-2052
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Eli Collins
>
> FSNamesystem#deleteInternal releases the write lock before syncing the log. However, FSNamesystem#startFileInternal
> calls delete -> deleteInternal with the write lock held, which means deleteInternal will
> sync the log while holding the lock. We could fix cases like this by passing a flag indicating
> whether the function should sync (e.g., in this case the sync is not necessary because startFileInternal's
> callers will sync the log), or by modifying the current calls to sync so that they flag that a sync is necessary
> before returning to the caller, rather than doing the sync right at the call site. This way
> the cost of syncing the log could be amortized over multiple function calls (and potentially
> multiple RPCs if we didn't mind introducing some synchronization).
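
The first option in the description (passing a flag that says whether the callee should sync)
can be sketched as follows. This is a hypothetical model, not HDFS code: the class
DeferredSyncSketch, the counters, and the method signatures are assumptions made for
illustration, and only the names deleteInternal and startFile echo the issue text. It relies on
the write lock being reentrant, so an inner unlock does not actually release a lock the outer
caller still holds.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of the "sync flag" idea; not the real FSNamesystem. */
class DeferredSyncSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int pendingEdits = 0;
    private int syncCalls = 0;
    private int syncsUnderWriteLock = 0;

    private void logEdit() { pendingEdits++; }

    private void logSync() {
        // Record the case the issue wants to avoid: syncing while the write lock is held.
        if (lock.isWriteLockedByCurrentThread()) {
            syncsUnderWriteLock++;
        }
        syncCalls++;
        pendingEdits = 0;
    }

    /** Buffers a delete edit; syncs afterwards only if the caller asks for it. */
    void deleteInternal(String src, boolean sync) {
        lock.writeLock().lock();
        try {
            logEdit();
        } finally {
            lock.writeLock().unlock(); // still held if an outer caller locked first
        }
        if (sync) {
            logSync();
        }
    }

    /** Deletes then creates under one write lock, and syncs once after releasing it. */
    void startFile(String src, boolean innerSync) {
        lock.writeLock().lock();
        try {
            deleteInternal(src, innerSync);
            logEdit();
        } finally {
            lock.writeLock().unlock();
        }
        logSync(); // single sync outside the lock covers both buffered edits
    }

    int getSyncCalls() { return syncCalls; }
    int getSyncsUnderWriteLock() { return syncsUnderWriteLock; }
    int getPendingEdits() { return pendingEdits; }

    public static void main(String[] args) {
        DeferredSyncSketch withInnerSync = new DeferredSyncSketch();
        withInnerSync.startFile("/foo", true);
        DeferredSyncSketch withFlag = new DeferredSyncSketch();
        withFlag.startFile("/foo", false);
        System.out.println("inner sync: " + withInnerSync.getSyncsUnderWriteLock()
            + " sync(s) under lock; flag off: " + withFlag.getSyncsUnderWriteLock());
    }
}
```

Calling startFile with innerSync set to false models the proposed fix: the inner delete only
buffers its edit, and the outer caller performs the one sync after the lock is released.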

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
