hbase-issues mailing list archives

From "Phabricator (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
Date Fri, 27 Jan 2012 17:22:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194927#comment-13194927 ]

Phabricator commented on HBASE-5259:

tedyu has commented on the revision "[jira][HBASE-5259] Normalize the RegionLocation in TableInputFormat
by the reverse DNS lookup.".

  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 I am learning
about the possibilities of reverse DNS failure:


  I think we should be prepared for such an occasion, as I outlined @ 9:43pm.
  Just for your reference.
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 bq. this error
case which isn't supposed to happen
  If I understand the statement correctly, you didn't say 'definitely not possible'.

  My earlier analysis w.r.t. NamingException shows that we would incur extra delay when
reverse DNS fails, since the assignment on line 169 doesn't put the fallback value into the cache.
  This can be regarded as a performance regression compared to the previous implementation, where
reverse DNS is not taken into account.
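
  To illustrate the caching concern, here is a minimal sketch, not the code from the patch.
It assumes a hypothetical ReverseResolver interface standing in for the real reverse DNS call,
and a hypothetical CachingReverseDns helper. The point is only that the fallback host name is
stored in the cache too, so a failing lookup is attempted once per address rather than once per split.

```java
import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;
import javax.naming.NamingException;

// Hypothetical stand-in for the reverse DNS call; not an HBase or Hadoop API.
interface ReverseResolver {
    String resolve(InetAddress address) throws NamingException;
}

// Hypothetical helper; class and method names are illustrative only.
class CachingReverseDns {
    private final Map<InetAddress, String> cache = new HashMap<>();
    private final ReverseResolver resolver;
    int lookups = 0; // number of resolver calls, kept only for demonstration

    CachingReverseDns(ReverseResolver resolver) {
        this.resolver = resolver;
    }

    // Returns the cached location if present; otherwise resolves it, falling
    // back to the region server's own host name when the lookup fails.
    String locationFor(InetAddress address, String fallbackHostName) {
        String cached = cache.get(address);
        if (cached != null) {
            return cached;
        }
        String hostName;
        try {
            lookups++;
            hostName = resolver.resolve(address);
        } catch (NamingException e) {
            // Reverse DNS failed: use the fallback value instead.
            hostName = fallbackHostName;
        }
        // Cache the result in both cases, so a failure is not retried
        // for every subsequent split on the same region server.
        cache.put(address, hostName);
        return hostName;
    }
}
```

  With a resolver that always throws NamingException, two calls for the same address trigger
only one lookup attempt; the second call is served from the cache.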


> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---------------------------------------------------------------------------
>                 Key: HBASE-5259
>                 URL: https://issues.apache.org/jira/browse/HBASE-5259
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, D1413.1.patch, D1413.2.patch,
D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch,
> Assuming HBase and MapReduce run in the same cluster, TableInputFormat overrides
the split function, which divides all the regions of one particular table into
a series of mapper tasks, so each mapper task processes a region or one part of a region.
Ideally, the mapper task should run on the same machine as the region server that hosts the
corresponding region. That is why TableInputFormat sets the RegionLocation:
so that the MapReduce framework can respect node locality.
> The code simply sets the host name of the region server as the HRegionLocation. However,
the host name of the region server may be formatted differently from the host name of the task
tracker (mapper task). The task tracker always gets its host name by reverse DNS lookup,
and the DNS service may return a different host name format. For example, the host name of the
region server is correctly set as a.b.c.d, while the reverse DNS lookup may return "a.b.c.d."
(with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as well. No matter
what host name format the DNS system uses, TableInputFormat is responsible
for keeping its host name format consistent with the MapReduce framework.
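
The mismatch described above can be sketched as follows. This is an illustration under stated
assumptions, not code from HBase or Hadoop: ReverseLookupName is a hypothetical helper, the
address is assumed to be IPv4, and sameLocation assumes that split locations are matched against
task tracker host names by plain string equality.

```java
// Hypothetical helper; not part of HBase or Hadoop.
class ReverseLookupName {
    // Builds the standard PTR query name for a dotted-quad IPv4 address
    // by reversing the octets and appending ".in-addr.arpa".
    static String ptrName(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        return parts[3] + "." + parts[2] + "." + parts[1] + "." + parts[0]
                + ".in-addr.arpa";
    }

    // Assumption: locality matching is a plain string comparison, so a
    // trailing dot on one side is enough to make the match fail.
    static boolean sameLocation(String splitLocation, String trackerHostName) {
        return splitLocation.equals(trackerHostName);
    }
}
```

Under that assumption, sameLocation("rs1.example.com", "rs1.example.com.") is false even though
both names refer to the same machine, whereas two names that both come from the same reverse
lookup compare equal whatever format the DNS service uses.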


