hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2052) FSNamesystem should not sync the log with the write lock held
Date Tue, 21 Jun 2011 17:15:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052671#comment-13052671
] 

Eli Collins commented on HDFS-2052:
-----------------------------------

bq. Eli, the *Internal variant of method should not sync editlogs. It should be done from
the variant of the method that calls the *Internal method. Will that not solve the problem?

We would also have to modify startFileInternal to call deleteInternal instead of delete, but
that would drop the audit logging, so it is probably easier to pass a boolean that indicates
whether to sync.

Also, in deleteInternal I'm not sure why we sync after the delete() but before removeBlocks();
it seems like we could defer the sync to the end of the method.
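
To make the boolean idea concrete, here is a rough, self-contained sketch. It is not the
actual FSNamesystem code: the class SyncFlagSketch, its fields, and the simplified
delete/startFile signatures are all hypothetical stand-ins, and logSync() only counts calls
so that syncs made under the write lock can be detected.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for FSNamesystem; every name here is illustrative only. */
final class SyncFlagSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> files = new HashSet<>();

    // Plain counters: good enough for a single-threaded illustration.
    int syncs = 0;           // total number of log syncs
    int syncsUnderLock = 0;  // syncs issued while the write lock was still held
    int auditEvents = 0;     // stand-in for audit logging

    /** Stand-in for the edit log sync; records whether the caller still holds the lock. */
    private void logSync() {
        syncs++;
        if (lock.isWriteLockedByCurrentThread()) {
            syncsUnderLock++;
        }
    }

    /** Deletes a path; syncs only if asked to, and only at the end of the method. */
    boolean delete(String path, boolean sync) {
        boolean removed;
        lock.writeLock().lock();
        try {
            removed = files.remove(path);
            // removing the file's blocks would also happen here, before any sync
        } finally {
            lock.writeLock().unlock();
        }
        if (sync) {
            logSync();       // deferred to the end of the method
        }
        auditEvents++;       // audit logging is kept because we still go through delete
        return removed;
    }

    /** Creates a file, overwriting any existing one; syncs once after unlocking. */
    void startFile(String path) {
        lock.writeLock().lock();
        try {
            if (files.contains(path)) {
                delete(path, false);  // caller syncs later, so skip the sync here
            }
            files.add(path);
        } finally {
            lock.writeLock().unlock();
        }
        logSync();
    }
}
```

Because the write lock is reentrant, calling delete(path, true) from inside startFile would
leave the lock held during the sync; passing false avoids that while still going through
delete for the audit event.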

bq. BTW I also think, in startFileInternal(), if we delete a file to overwrite and later we
fail to create the file, we should resurrect the file?

Agree, let's file a new bug.

> FSNamesystem should not sync the log with the write lock held
> -------------------------------------------------------------
>
>                 Key: HDFS-2052
>                 URL: https://issues.apache.org/jira/browse/HDFS-2052
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Eli Collins
>
> FSNamesystem#deleteInternal releases the write lock before syncing the log; however, FSNamesystem#startFileInternal
calls delete -> deleteInternal with the write lock held, which means deleteInternal will
sync the log while holding the lock. We could fix cases like this by passing a flag indicating
whether the function should sync (e.g. in this case the sync is not necessary because startFileInternal's
callers will sync the log), or modify the current calls to sync to flag that a sync is necessary
before returning to the caller rather than doing the sync right at the call site. This way
the cost of syncing the log could be amortized over multiple function calls (and potentially
multiple RPCs if we didn't mind introducing some synchronization).
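
The second option (flagging that a sync is necessary instead of syncing at the call site)
could look roughly like the sketch below. This is only an illustration under assumed names:
DeferredSyncSketch, withWriteLock, logEdit and syncIfNeeded are hypothetical and do not
exist in FSNamesystem, and the "sync" is just a counter.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of "mark a sync as needed, perform it after unlocking". */
final class DeferredSyncSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> files = new HashSet<>();
    private final AtomicBoolean syncNeeded = new AtomicBoolean(false);

    final AtomicInteger edits = new AtomicInteger();          // edits written to the log
    final AtomicInteger syncs = new AtomicInteger();          // syncs actually performed
    final AtomicInteger syncsUnderLock = new AtomicInteger(); // should stay at zero

    /** Records an edit and flags that a sync is owed; called with the write lock held. */
    private void logEdit() {
        edits.incrementAndGet();
        syncNeeded.set(true);
    }

    /** Performs one sync covering every edit flagged so far, if any. */
    private void syncIfNeeded() {
        if (syncNeeded.getAndSet(false)) {
            syncs.incrementAndGet();
            if (lock.isWriteLockedByCurrentThread()) {
                syncsUnderLock.incrementAndGet();
            }
        }
    }

    /** Runs op under the write lock; only the outermost call syncs, after unlocking. */
    private void withWriteLock(Runnable op) {
        lock.writeLock().lock();
        try {
            op.run();
        } finally {
            lock.writeLock().unlock();
        }
        // The lock is reentrant, so nested calls still hold it here and skip the sync.
        if (!lock.isWriteLockedByCurrentThread()) {
            syncIfNeeded();
        }
    }

    void delete(String path) {
        withWriteLock(() -> {
            if (files.remove(path)) {
                logEdit();
            }
        });
    }

    void startFile(String path) {
        withWriteLock(() -> {
            if (files.contains(path)) {
                delete(path);  // nested call: logs an edit but does not sync
            }
            files.add(path);
            logEdit();
        });
    }
}
```

In this sketch an overwrite logs two edits (delete plus create) but pays for a single sync,
and with several threads one thread's sync may also cover edits flagged by another, which
is the kind of amortization across RPCs mentioned above.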

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
