hadoop-common-issues mailing list archives

From "Steve Loughran (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8047) CachedDNSToSwitchMapping caches negative results forever
Date Wed, 22 Feb 2012 15:35:49 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213708#comment-13213708 ]

Steve Loughran commented on HADOOP-8047:

You can set the JVM up not to cache negative DNS entries; it used to cache them forever
(bad, bad Sun), but this is a separate issue.
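
As a minimal sketch of that JVM setting: the JDK reads the negative-lookup TTL from the
`networkaddress.cache.negative.ttl` security property, where 0 means "never cache" and -1 means
"cache forever". The class and method names below are hypothetical, and this only touches the
JVM's own DNS cache, not the switch mapping cache this issue is about.

```java
import java.security.Security;

public class NegativeDnsCacheConfig {

    /**
     * Sets the JVM-wide TTL, in seconds, for failed name lookups.
     * 0 disables negative caching; -1 caches failures forever.
     */
    public static void setNegativeTtlSeconds(int seconds) {
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(seconds));
    }

    public static void main(String[] args) {
        // Assumption: this runs early, before the first InetAddress lookup,
        // since the JDK may read its cache policy only once.
        setNegativeTtlSeconds(0);
        System.out.println(
                Security.getProperty("networkaddress.cache.negative.ttl")); // prints 0
    }
}
```

The same property can also be set in the JRE's `java.security` file rather than in code.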

What the switch mapping is doing is caching the mapping to a specific switch. So if new
machines are added and somehow report in before the topology script is updated to cover
them, the default rack setting gets cached for them.
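
To make the failure mode concrete, here is a toy sketch, not the Hadoop code: every class
name is hypothetical, the resolver function stands in for the topology script, and
"/default-rack" is assumed as the fallback value. `StickyCache` caches whatever the resolver
returns; `PositiveOnlyCache` is one possible variation that declines to cache the fallback.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class RackCacheSketch {
    // Assumed fallback value returned for hosts the script does not know.
    public static final String DEFAULT_RACK = "/default-rack";

    /** Caches every answer, including the default-rack fallback. */
    public static class StickyCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> resolver;

        public StickyCache(Function<String, String> resolver) {
            this.resolver = resolver;
        }

        public String resolve(String host) {
            // Once a host has an entry, the resolver is never consulted again.
            return cache.computeIfAbsent(host, resolver);
        }
    }

    /** Caches only real answers, so a default-rack result is retried next time. */
    public static class PositiveOnlyCache {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> resolver;

        public PositiveOnlyCache(Function<String, String> resolver) {
            this.resolver = resolver;
        }

        public String resolve(String host) {
            String cached = cache.get(host);
            if (cached != null) {
                return cached;
            }
            String rack = resolver.apply(host);
            if (!DEFAULT_RACK.equals(rack)) {
                cache.put(host, rack);
            }
            return rack;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the topology script's host-to-rack table.
        Map<String, String> script = new ConcurrentHashMap<>();
        Function<String, String> resolver = h -> script.getOrDefault(h, DEFAULT_RACK);

        StickyCache sticky = new StickyCache(resolver);
        PositiveOnlyCache positiveOnly = new PositiveOnlyCache(resolver);

        // The new node reports in before the script knows about it.
        sticky.resolve("node9");
        positiveOnly.resolve("node9");

        // The script is updated afterwards.
        script.put("node9", "/rack2");

        System.out.println(sticky.resolve("node9"));       // prints /default-rack
        System.out.println(positiveOnly.resolve("node9")); // prints /rack2
    }
}
```

The positive-only variant trades extra script invocations for unknown hosts against picking up
a late script update; whether that trade is acceptable on a real cluster is not something this
sketch settles.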

For extra fun, every switch mapping is wrapped by a (completely gratuitous) cache. I say
gratuitous as the script mapper does its own caching, and other implementations are likely
to be more dynamic (will need to check with people who have written their own mapper, obviously).
Because of that wrapping, no dynamic topology scripts will ever have their changes picked up.

> CachedDNSToSwitchMapping caches negative results forever
> --------------------------------------------------------
>                 Key: HADOOP-8047
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8047
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.23.0, 0.24.0, 1.0.0
>            Reporter: Steve Loughran
>            Priority: Trivial
> This is very minor, just worth filing in JIRA in case someone wants to rethink topology
> caching for a dynamic world.
> # The CachedDNSToSwitchMapping caches the results from all relayed DNS queries.
> # The DNS script mapper returns the default rack for all unknown entries (or when the
> script fails).
> # The Cache stores this in its map and never re-resolves it.
> As a result, if a node that the existing script cannot resolve is added to a live cluster,
> it is cached on the default rack and won't get assigned to its real rack unless the script
> is updated before that node's rack is first resolved.

> This isn't usually that important; it just means "update your scripts before adding new
> racks". Perhaps there should be a page on that activity, "runbook and checklist for adding
> new servers and racks".
> Where it would matter is if anyone started playing with dynamic topologies, but in that
> situation the cached mapping itself would become the liability, as it assumes that servers
> never switch switches in a live system: the topology is static for existing nodes.
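
One way the "dynamic world" rethink could look is a cache whose entries expire, so that a
changed topology is eventually re-resolved. This is only a sketch under stated assumptions:
`ExpiringRackCache`, its TTL parameter, and the injected clock are all hypothetical and do not
correspond to any existing Hadoop class or configuration option.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class ExpiringRackCache {

    /** A cached rack plus the time after which it must be re-resolved. */
    private static final class Entry {
        final String rack;
        final long expiresAtMillis;

        Entry(String rack, long expiresAtMillis) {
            this.rack = rack;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver; // stand-in for the topology script
    private final long ttlMillis;
    private final LongSupplier clock;                // injected so expiry can be tested

    public ExpiringRackCache(Function<String, String> resolver,
                             long ttlMillis,
                             LongSupplier clock) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    public String resolve(String host) {
        long now = clock.getAsLong();
        Entry entry = cache.get(host);
        if (entry == null || now >= entry.expiresAtMillis) {
            // Missing or stale: ask the resolver again and restart the TTL.
            entry = new Entry(resolver.apply(host), now + ttlMillis);
            cache.put(host, entry);
        }
        return entry.rack;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. Note that an
expiring cache bounds how long a stale or default-rack entry survives; it does not address
whatever else in the system assumes a node's rack never changes.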
