hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5394) fix race conditions in DN caching and uncaching
Date Wed, 06 Nov 2013 03:25:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814568#comment-13814568 ]

Colin Patrick McCabe commented on HDFS-5394:

bq. Unused imports in FsDatasetImpl and FsVolumeImpl


bq. Do we still need to rename getExecutor to getCacheExecutor in FsVolumeImpl?

Well, the name of the variable is {{cacheExecutor}}; shouldn't the getter be {{getCacheExecutor}}?

bq. State#isUncaching() is unused


bq. Could use a core pool size of 0 for uncachingExecutor, I don't think it's that latency-sensitive


bq. usedBytes javadoc: "more things to cache that we can't actually do because of" is an awkward
turn of phrase, maybe say "assign more blocks than we can actually cache because of" instead


bq. MappableBlock#load javadoc: visibleLeng parameter should be renamed to length. The return
value is now also a MappableBlock, not a boolean.


bq. Key: rename id to blockId for clarity? or add a bit of javadoc

added javadoc

bq. Naming the HashMap replicaMap is confusing since there's already a datanode ReplicaMap
class. Maybe mappableBlockMap instead?


bq. Caching can fail if the underlying block is invalidated in between getting the block's
filename and running the CacheTask. It'd be nice to distinguish this race from a real error
for when we do metrics (and also quash the exception).

I just added a catch block for the {{FileNotFoundException}} which both {{getBlockInputStream}}
and {{getMetaDataInputStream}} can throw.  I still think we want to log this exception, but
at INFO rather than WARN.  We will retry sending the {{DNA_CACHE}} command (once 5366 is committed),
so hitting this narrow race if a block is being moved is just a temporary setback.
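The shape of that catch block can be sketched as follows. This is a simplified stand-in, not the actual patch: `openBlockStreams` and `tryCache` are hypothetical names standing in for the real {{FsDatasetImpl}} calls, and the string return value stands in for the real metrics/logging path.

```java
import java.io.FileNotFoundException;

public class CacheTaskSketch {
    // Stand-in for getBlockInputStream/getMetaDataInputStream, which can
    // throw FileNotFoundException if the block was invalidated under us.
    static void openBlockStreams(boolean blockGone) throws FileNotFoundException {
        if (blockGone) {
            throw new FileNotFoundException("block file removed");
        }
    }

    static String tryCache(boolean blockGone) {
        try {
            openBlockStreams(blockGone);
            return "CACHED";
        } catch (FileNotFoundException e) {
            // Benign race: the block moved or was invalidated between the
            // DNA_CACHE command arriving and the CacheTask running.
            // Log at INFO, not WARN; the NameNode will resend the command.
            return "SKIPPED";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCache(false)); // CACHED
        System.out.println(tryCache(true));  // SKIPPED
    }
}
```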

bq. If we get a DNA_CACHE for a block that is currently being uncached, shouldn't we try to
cancel the uncache and re-cache it? The NN will resend the command, but it'd be better to
not have to wait for that.

We don't know how far along the uncaching process is.  We can't cancel it if we already called
{{munmap}}.  We could allow cancellation of pending uncaches by splitting {{UNCACHING}} into
{{UNCACHING_SCHEDULED}} and {{UNCACHING_IN_PROGRESS}}, and only allowing cancellation on the
former.  This might be a good improvement to make as part of 5182.  But for now, the uncaching
process is really quick, so let's keep it simple.
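The split-state idea could look roughly like this. A hypothetical sketch only (the state and method names are not from any patch): cancellation is permitted only while the uncache task is still queued, never once it may have started calling {{munmap}}.

```java
public class UncacheStateSketch {
    // Hypothetical split of the single UNCACHING state into two.
    enum State { CACHED, UNCACHING_SCHEDULED, UNCACHING_IN_PROGRESS }

    // Cancellation is only safe before the scheduled Runnable has started;
    // once it is in progress, munmap may already have been called.
    static boolean tryCancelUncache(State s) {
        return s == State.UNCACHING_SCHEDULED;
    }

    public static void main(String[] args) {
        System.out.println(tryCancelUncache(State.UNCACHING_SCHEDULED));  // true
        System.out.println(tryCancelUncache(State.UNCACHING_IN_PROGRESS)); // false
    }
}
```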

bq. Could this be written with value.state == State.CACHING_CANCELLED instead? Would be clearer,
and I believe equivalent since uncacheBlock won't set the state to UNCACHING if it's CACHING

Well, if value is null, you don't want to be dereferencing that, right?
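To illustrate the point (with hypothetical `Value` and `State` types standing in for the real map entry): the map lookup can return null when there is no entry at all, so the null check must come before any comparison on {{value.state}}.

```java
public class NullCheckSketch {
    enum State { CACHING, CACHING_CANCELLED, CACHED, UNCACHING }

    static class Value {
        final State state;
        Value(State state) { this.state = state; }
    }

    // Guard against a missing map entry before inspecting its state;
    // dereferencing a null value would throw NullPointerException.
    static boolean isCancelled(Value value) {
        return value != null && value.state == State.CACHING_CANCELLED;
    }

    public static void main(String[] args) {
        System.out.println(isCancelled(null));                               // false
        System.out.println(isCancelled(new Value(State.CACHING_CANCELLED))); // true
    }
}
```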

bq. Even better would be interrupting a CachingTask on uncache since it'll save us I/O and

That kind of interruption logic gets complex quickly.  I'd rather save that for a potential
performance improvement JIRA later down the line.  I also think that if we're thrashing (cancelling
caching requests right and left) the real fix might be on the NameNode anyway...

bq. Could we combine CACHING_CANCELLED into UNCACHING? It seems like CachingTask could check
for UNCACHING in that if statement at the end and uncache, same sort of change for uncacheBlock.

I would rather not do that, since right now we can look at entries in the map and instantly
know that anything in state {{UNCACHING}} has an associated {{Runnable}} scheduled in the
{{Executor}}.  cancelled is not really the same thing as uncaching since in the former case,
there is actually nothing to do!

bq. I think using a switch/case on the prevValue.state in uncacheBlock would be clearer


bq. 6,000,000 milliseconds seems like a very long test timeout. Can we change them to, say, 60,000?

the general idea is to do stuff that can time out in {{GenericTestUtils#waitFor}} blocks.
 The waitFor blocks actually give useful backtraces and messages when they time out, unlike
the generic test timeouts.  I wanted to avoid the scenario where the test-level timeouts kick
in, but out of paranoia, I set the overall test timeout to 10 minutes in case there was some
other unexpected timeout.  I wanted to avoid the issues we've had with zombie tests in Jenkins
causing heisenfailures.
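The polling pattern behind {{GenericTestUtils#waitFor}} can be sketched like this. This is a simplified stand-in, not Hadoop's actual implementation: the point is that a per-condition timeout fails with a descriptive message, unlike a blunt whole-test timeout.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class WaitForSketch {
    // Poll a condition until it holds or the deadline passes; on timeout,
    // fail with a message that says exactly which wait expired.
    static void waitFor(Supplier<Boolean> check, long intervalMs, long timeoutMs)
            throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!check.get()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException(
                    "condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~50 ms, well under the 1000 ms limit.
        waitFor(() -> System.currentTimeMillis() - start > 50, 10, 1000);
        System.out.println("done");
    }
}
```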

bq. Are these new log prints for sanity checking? Maybe we can just remove them.

it's more so you can see what's going on in the sea of log messages.  otherwise, it becomes
hard to debug.

bq. Some of the comments seem to refer to a previous patch version that used a countdown latch.


bq. It's unclear what this is testing beyond caching and then uncaching a bunch of blocks.
Can we check for log prints to see that it's actually cancelling as expected? Any other ideas
for definitively hitting cancellation?

we could add callback hooks to more points in the system, and set up a bunch of countdown
latches (or similar), but it might be overkill here.  I have looked at the logs and I do see
cancellation.  you'd have to hit a GC or something to avoid it in this test, I think

> fix race conditions in DN caching and uncaching
> -----------------------------------------------
>                 Key: HDFS-5394
>                 URL: https://issues.apache.org/jira/browse/HDFS-5394
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-5394-caching.001.patch, HDFS-5394-caching.002.patch, HDFS-5394-caching.003.patch,
> HDFS-5394-caching.004.patch, HDFS-5394.005.patch, HDFS-5394.006.patch, HDFS-5394.007.patch
>
> The DN needs to handle situations where it is asked to cache the same replica more than
> once.  (Currently, it can actually do two mmaps and mlocks.)  It also needs to handle the
> situation where caching a replica is cancelled before said caching completes.

This message was sent by Atlassian JIRA
