hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5366) recaching improvements
Date Wed, 16 Oct 2013 01:54:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796322#comment-13796322 ]

Colin Patrick McCabe commented on HDFS-5366:
--------------------------------------------

The other question that came up in discussion on HDFS-5096 is whether we should have a dedicated
thread (independent of the {{CacheReplicationMonitor}} thread) which periodically re-examines
the outstanding cache and uncache requests, and reschedules them to a different node if they
aren't fulfilled.  I've thought about this, but I'm not sure that we need it.

The problem is that both caching and uncaching take time.  Caching takes time because it involves
reading from disk.  Uncaching takes time because a client might have an mmap that needs to
be revoked.  The involuntary revocation period will be at least 5 minutes, to avoid having
clients burned by GCs.

If we're too aggressive about rescheduling our cache/uncache operations, we may create a lot
of churn.  If the period of such a "rescheduler thread" is going to be measured in minutes, isn't
it simpler to just use the rescanning thread to handle this scenario?
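To make the tradeoff concrete, here is a minimal sketch of the rescan-thread approach: only reschedule an outstanding request once it has been pending longer than the involuntary mmap revocation period, so the rescan itself provides the back-off.  The class and constant names (`RescanRescheduleSketch`, `RESCHEDULE_TIMEOUT_MS`, `PendingCacheOp`-style bookkeeping) are illustrative, not actual HDFS code.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch: during the periodic rescan, decide whether an
 * outstanding cache/uncache request is stuck and should be moved to a
 * different node.  Not actual HDFS classes or configuration keys.
 */
public class RescanRescheduleSketch {
  // Wait at least as long as the involuntary mmap revocation period
  // (~5 minutes, per the discussion above) before declaring a request
  // stuck; rescheduling sooner would just create churn.
  static final long RESCHEDULE_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(5);

  /** True if a request sent at sentTimeMs is old enough to reschedule. */
  public static boolean shouldReschedule(long sentTimeMs, long nowMs) {
    return nowMs - sentTimeMs >= RESCHEDULE_TIMEOUT_MS;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // A request outstanding for 6 minutes is eligible; 1 minute is not.
    System.out.println(shouldReschedule(now - TimeUnit.MINUTES.toMillis(6), now)); // true
    System.out.println(shouldReschedule(now - TimeUnit.MINUTES.toMillis(1), now)); // false
  }
}
```

Since the rescan already visits every cache directive, this check adds no extra thread; the cost is only that rescheduling latency is quantized to the rescan interval.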

The other problem is that we currently rely on the {{DatanodeManager}} to tell us when a node
is bad.  Its timeouts are generous (10.5 minutes by default to declare a node dead), so the
proposed "rescheduler" would either have to maintain its own list of who is naughty and nice,
or have a really long period (again overlapping with the rescanner thread).  I don't really
want to duplicate the deadNodes list...
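One way to avoid duplicating the deadNodes list is to pass the rescheduler a liveness predicate backed by the existing {{DatanodeManager}} state, rather than having it track node health itself.  A hedged sketch, with invented names ({{LivenessFilterSketch}}, {{filterLive}}):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Hypothetical sketch: keep a single authoritative source of node
 * liveness (in HDFS that would be the DatanodeManager's view) and have
 * the caching code filter its candidates through it, instead of
 * maintaining a second "naughty and nice" list with its own timeouts.
 */
public class LivenessFilterSketch {
  /** Return only the nodes the shared liveness source considers alive. */
  public static List<String> filterLive(List<String> candidates,
                                        Predicate<String> isAlive) {
    List<String> live = new ArrayList<>();
    for (String node : candidates) {
      if (isAlive.test(node)) {
        live.add(node);
      }
    }
    return live;
  }
}
```

The downside, as noted above, is inheriting the generous timeouts (10.5 minutes by default to declare a node dead), but the upside is one consistent notion of deadness across the NameNode.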

I do think we should resend the DNA_CACHE, etc. as I mentioned above.  Networks do lose messages,
after all.  But we might have to assume that if a DN tells us it can cache X bytes, it's
telling the truth.  Otherwise, the failure cases we have to think about tend to proliferate.
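A bounded resend would look roughly like the following sketch.  The names ({{ResendSketch}}, {{MAX_SENDS}}, {{shouldResend}}) and the cap of 3 are assumptions for illustration, not HDFS configuration.

```java
/**
 * Hypothetical sketch: resend an unacknowledged DNA_CACHE/DNA_UNCACHE
 * command a fixed number of times before giving up, to tolerate lost
 * messages without retrying forever.
 */
public class ResendSketch {
  // Illustrative cap; not a real HDFS setting.
  static final int MAX_SENDS = 3;

  /**
   * True if the command should go out again on the next heartbeat:
   * it has not been acknowledged and we are under the send cap.
   */
  public static boolean shouldResend(int sendsSoFar, boolean acked) {
    return !acked && sendsSoFar < MAX_SENDS;
  }
}
```

After {{MAX_SENDS}} unacknowledged attempts, the sensible fallback is the rescheduling path discussed above rather than further retries to the same node.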

> recaching improvements
> ----------------------
>
>                 Key: HDFS-5366
>                 URL: https://issues.apache.org/jira/browse/HDFS-5366
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>
> There are a few things about our HDFS-4949 recaching strategy that could be improved.
> * We should monitor the DN's maximum and current mlock'ed memory consumption levels, so that we don't ask the DN to do stuff it can't.
> * We should not try to initiate caching on stale DataNodes (although we should not recache things stored on such nodes until they're declared dead).
> * We might want to resend the {{DNA_CACHE}} or {{DNA_UNCACHE}} command a few times before giving up.  Currently, we only send it once.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
