hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-1790) review use of synchronized methods in FSNamesystem
Date Sat, 26 Mar 2011 00:35:11 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Matt Foley updated HDFS-1790:
-----------------------------

    Description: 
While analyzing SafeMode semantics for HDFS-1726, I noticed the following usage of synchronized
methods in FSNamesystem that raised questions:

1. incVolumeFailure() is synchronized on the FSNamesystem instance (monitor lock), for unclear
reasons.  Is this a left-over from the conversion to read/write lock in FSNamesystem?

2. isInSafeMode() and isPopulatingReplQueues() are also synchronized on the FSNamesystem instance
(monitor lock), but the SafeModeInfo methods they call (safeMode.isInSafeMode() and safeMode.isPopulatingReplQueues()
are synchronized on the SafeModeInfo instance.  Is synchronizing on the FSNamesystem instance
necessary?  What does it add?  If it is necessary, should it be using the read/write lock?

3. In SafeModeInfo, these methods are synchronized on the SafeModeInfo instance:
- isOn
- isPopulatingReplQueues
- leave
- initialize
- canInitializeReplQueues
- canLeave
- setBlock
- incrementSafeBlockCount
- decrementSafeBlockCount
- setManual
but these are not:
- enter
- needEnter
- checkMode
- isManual
- isConsistent

Regarding these:
- isOn() asserts isConsistent(), but is otherwise an atomic read operation.
- isPopulatingReplQueues() is an atomic read.  Does it need synchronization?
- leave() is complex, but shouldn't it be synchronized with enter(), which is also a write
operation?  Yet enter() is unsynchronized.
- initializeReplQueues() calls blockManager.processMisReplicatedBlocks(), which can take minutes
to run.  By synchronizing on the SafeModeInfo instance, it prevents essentially all of the
other safeMode methods from running for the duration!  Is this desirable or needed?
- canInitializeReplQueues() is a read-only operation, although it does compare two read values.
 needEnter() compares four read values, and is unsynchronized.  Does either need synchronization?
- canLeave() is a compound operation, it's good that it is synchronized.
- checkMode() is a big compound read/write operation.  Doesn't it need synchronization if
the other methods do?
and so on.


  was:
While analyzing SafeMode semantics for HDFS-1726, I noticed the following usage of synchronized
methods in FSNamesystem that raised questions:

1. incVolumeFailure() is synchronized on the FSNamesystem instance (monitor lock), for unclear
reasons.  Is this a left-over from the conversion to read/write lock in FSNamesystem?

2. isInSafeMode() and isPopulatingReplQueues() are also synchronized on the FSNamesystem instance
(monitor lock), but the SafeModeInfo methods they call (safeMode.isInSafeMode() and safeMode.isPopulatingReplQueues()
are synchronized on the SafeModeInfo instance.  Is synchronizing on the FSNamesystem instance
necessary?  What does it add?  If it is necessary, should it be using the read/write lock?

3. In SafeModeInfo, these methods are synchronized on the SafeModeInfo instance:
	isOn
	isPopulatingReplQueues
	leave
	initialize
	canInitializeReplQueues
	canLeave
	setBlock
	incrementSafeBlockCount
	decrementSafeBlockCount
	setManual
but these are not:
	enter
	needEnter
	checkMode
	isManual
	isConsistent

- isOn() asserts isConsistent(), but is otherwise an atomic read operation.
- isPopulatingReplQueues() is an atomic read.  Does it need synchronization?
- leave() is complex, but shouldn't it be synchronized with enter(), which is also a write
operation?  Yet enter() is unsynchronized.
- initializeReplQueues() calls blockManager.processMisReplicatedBlocks(), which can take minutes
to run.  By synchronizing on the SafeModeInfo instance, it prevents essentially all of the
other safeMode methods from running for the duration!  Is this desirable or needed?
- canInitializeReplQueues() is a read-only operation, although it does compare two read values.
 needEnter() compares four read values, and is unsynchronized.  Does either need synchronization?
- canLeave() is a compound operation, it's good that it is synchronized.
- checkMode() is a big compound read/write operation.  Doesn't it need synchronization if
the other methods do?
and so on.



> review use of synchronized methods in FSNamesystem
> --------------------------------------------------
>
>                 Key: HDFS-1790
>                 URL: https://issues.apache.org/jira/browse/HDFS-1790
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Matt Foley
>             Fix For: 0.23.0
>
>
> While analyzing SafeMode semantics for HDFS-1726, I noticed the following usage of synchronized
methods in FSNamesystem that raised questions:
> 1. incVolumeFailure() is synchronized on the FSNamesystem instance (monitor lock), for
unclear reasons.  Is this a left-over from the conversion to read/write lock in FSNamesystem?
> 2. isInSafeMode() and isPopulatingReplQueues() are also synchronized on the FSNamesystem
instance (monitor lock), but the SafeModeInfo methods they call (safeMode.isInSafeMode() and
safeMode.isPopulatingReplQueues() are synchronized on the SafeModeInfo instance.  Is synchronizing
on the FSNamesystem instance necessary?  What does it add?  If it is necessary, should it
be using the read/write lock?
> 3. In SafeModeInfo, these methods are synchronized on the SafeModeInfo instance:
> - isOn
> - isPopulatingReplQueues
> - leave
> - initialize
> - canInitializeReplQueues
> - canLeave
> - setBlock
> - incrementSafeBlockCount
> - decrementSafeBlockCount
> - setManual
> but these are not:
> - enter
> - needEnter
> - checkMode
> - isManual
> - isConsistent
> Regarding these:
> - isOn() asserts isConsistent(), but is otherwise an atomic read operation.
> - isPopulatingReplQueues() is an atomic read.  Does it need synchronization?
> - leave() is complex, but shouldn't it be synchronized with enter(), which is also a
write operation?  Yet enter() is unsynchronized.
> - initializeReplQueues() calls blockManager.processMisReplicatedBlocks(), which can take
minutes to run.  By synchronizing on the SafeModeInfo instance, it prevents essentially all
of the other safeMode methods from running for the duration!  Is this desirable or needed?
> - canInitializeReplQueues() is a read-only operation, although it does compare two read
values.  needEnter() compares four read values, and is unsynchronized.  Does either need synchronization?
> - canLeave() is a compound operation, it's good that it is synchronized.
> - checkMode() is a big compound read/write operation.  Doesn't it need synchronization
if the other methods do?
> and so on.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message