hadoop-mapreduce-issues mailing list archives

From "Dick King (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1222) [Mumak] We should not include nodes with numeric ips in cluster topology.
Date Tue, 01 Dec 2009 00:30:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783962#action_12783962 ]

Dick King commented on MAPREDUCE-1222:
--------------------------------------

After I wrote my comment of 24/Nov/09 07:59 PM, I looked at the Java API because I began to
wonder whether unescaping and then using the Java API could be made to work by itself.  I did
look for alternatives before I created my big regular expression.

The big problem is that Java doesn't really offer any API that distinguishes numeric IP
addresses from symbolic addresses.  Although InetAddress.getByName(String) must have some
means of parsing IPv4 and IPv6 literal numeric addresses, this functionality is not exposed
to java.net.* users.  InetAddress.getByName(String) will parse either a numeric address or
a symbolic name and produce indistinguishable results, so that piece of the API gives us no
means to tell the two apart.  I was unable to find any other API that makes the distinction.
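
As a rough illustration of the point above, the sketch below passes two literals to
InetAddress.getByName(String) and prints what a caller can observe afterwards.  The class
and method names are hypothetical and are not part of any patch on this issue.  The concrete
subclass reveals the address family, but InetAddress has no predicate method (no
"isLiteral()" or similar) that reports whether the input was numeric or symbolic.

```java
// Illustrative sketch only; LiteralLookupSketch and describe() are hypothetical names.
class LiteralLookupSketch {
    // Parses an address string with the standard API and reports what the caller can observe.
    static String describe(String literal) {
        try {
            java.net.InetAddress addr = java.net.InetAddress.getByName(literal);
            // The subclass tells us the address family, not how the input was spelled.
            return addr.getClass().getSimpleName() + " " + addr.getHostAddress();
        } catch (java.net.UnknownHostException e) {
            // Wrapped so that callers do not need a throws clause in this sketch.
            throw new IllegalArgumentException("cannot resolve: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("10.1.2.3")); // IPv4 literal, parsed without a name lookup
        System.out.println(describe("::1"));      // IPv6 loopback literal
    }
}
```

A host name passed to the same method would come back as the same kind of object, which is
why a separate syntactic check on the string is needed.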

The formats of numeric literal IPv4 and IPv6 internet addresses are fixed in RFCs and are
extremely unlikely to change in the foreseeable future, so a regular expression written
against them should not go stale.  The only exposure we have is a possible future IPv8, but
ICANN is doing its best to make that unnecessary for a very long time.

Since Apache already owns this regular expression, we should consider using it.

I considered the simpler approach of treating any address that contains a colon character
as a numeric IPv6 address, but colons are used as other punctuation, e.g., as the separator
between an IP address and a port number.  That solution felt too brittle and accident-prone
to me, and it doesn't solve the IPv8 problem.  There is a continuum of IPv6 solutions ranging
from "look for a colon" to the correct regular expression you see here, and no principled way
to decide where to stop.
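
To make the brittleness concrete, here is a minimal sketch that contrasts the "look for a
colon" heuristic with a strict literal check.  It is not the regular expression from the
attached patch: for brevity it validates only dotted-quad IPv4 literals, and the class name,
method names, and sample host names are all hypothetical.

```java
// Illustrative sketch only; not the predicate from IPv6-predicate.patch.
class NumericIpSketch {
    // One decimal octet in the range 0..255.
    private static final String OCTET = "(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";

    // Four octets separated by dots; matches() below requires the whole string to match.
    private static final java.util.regex.Pattern IPV4_LITERAL =
        java.util.regex.Pattern.compile("(?:" + OCTET + "\\.){3}" + OCTET);

    // The naive heuristic: any colon means "numeric IPv6 address".
    static boolean looksNumericByColon(String location) {
        return location.indexOf(':') >= 0;
    }

    // The strict check, limited to IPv4 literals in this sketch.
    static boolean isIpv4Literal(String location) {
        return IPV4_LITERAL.matcher(location).matches();
    }

    public static void main(String[] args) {
        String hostAndPort = "node17.example.com:50010"; // hypothetical host:port string
        System.out.println(looksNumericByColon(hostAndPort)); // heuristic flags a host name
        System.out.println(isIpv4Literal(hostAndPort));       // strict check rejects it
        System.out.println(isIpv4Literal("10.1.2.3"));        // a genuine IPv4 literal
    }
}
```

The heuristic misclassifies the host-and-port string as numeric, while the strict check
accepts only strings that are literals in their entirety.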

> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1222
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/mumak
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Hong Tang
>            Assignee: Hong Tang
>             Fix For: 0.21.0, 0.22.0
>
>         Attachments: IPv6-predicate.patch, mapreduce-1222-20091119.patch, mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job history logs.
> Due to HDFS-778, a cluster node may appear either as a numeric IP or as a host name in job
> history logs. We should exclude nodes that appear as numeric IPs from the cluster topology
> when we run Mumak, until a solution is found so that numeric IPs never appear in input
> split locations.


