Date: Fri, 27 Jan 2012 05:43:51 +0000 (UTC)
From: "Phabricator (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <929818190.84555.1327643031068.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1337615535.67679.1327343921005.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5259) Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194489#comment-13194489 ]

Phabricator commented on HBASE-5259:
------------------------------------

tedyu has commented on the revision "[jira][HBASE-5259] Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:89 Map is reflected in the type of this field. My suggestion was only for your reference.
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:198 If NamingException is thrown from line 201, line 202 would be skipped. Line 169 might then be executed multiple times, because regionServerAddress may carry the same (unresolvable) value across multiple iterations. Correct me if I am wrong.

REVISION DETAIL
  https://reviews.facebook.net/D1413

> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5259
>                 URL: https://issues.apache.org/jira/browse/HBASE-5259
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch, D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, TableInputFormat overrides the split function, which divides all the regions of one particular table into a series of mapper tasks, so that each mapper task processes a region or one part of a region. Ideally, a mapper task should run on the same machine as the region server that hosts the corresponding region. That is why TableInputFormat sets the RegionLocation: so that the MapReduce framework can respect node locality.
> The code simply sets the host name of the region server as the HRegionLocation. However, the host name of the region server may be in a different format from the host name of the task tracker (mapper task). The task tracker always gets its host name by reverse DNS lookup, and the DNS service may return a different host name format. For example, the host name of the region server is correctly set as a.b.c.d, while the reverse DNS lookup may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by reverse DNS lookup as well. Whatever host name format the DNS system uses, TableInputFormat is responsible for keeping its host name format consistent with the MapReduce framework.
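
To make the concern in the inline comment concrete, here is a minimal sketch of caching one location per region server address, including the fallback used when the lookup fails, so that an unresolvable address is looked up only once. It is not the code from D1413: the class ReverseDnsCache, the Resolver interface, and the method names are hypothetical, and Resolver merely stands in for whatever reverse DNS call the patch makes.

```java
import java.util.HashMap;
import java.util.Map;
import javax.naming.NamingException;

/** Hypothetical helper: remembers one location per address, including failed lookups. */
class ReverseDnsCache {

    /** Hypothetical stand-in for the real reverse DNS call, which may throw NamingException. */
    interface Resolver {
        String reverseLookup(String ipAddress) throws NamingException;
    }

    private final Map<String, String> cache = new HashMap<>();
    private final Resolver resolver;
    private int lookups = 0; // how many times the resolver was actually called

    ReverseDnsCache(Resolver resolver) {
        this.resolver = resolver;
    }

    /**
     * Returns the name reported by reverse DNS, unchanged (trailing dot included),
     * or fallbackHostName when the lookup fails or yields nothing.
     */
    String locationFor(String ipAddress, String fallbackHostName) {
        String cached = cache.get(ipAddress);
        if (cached != null) {
            return cached;
        }
        lookups++;
        String location;
        try {
            location = resolver.reverseLookup(ipAddress);
        } catch (NamingException e) {
            location = null; // unresolvable: fall through to the fallback below
        }
        if (location == null) {
            location = fallbackHostName;
        }
        // Cache the result on both paths, so a failing address is not retried
        // on every later iteration.
        cache.put(ipAddress, location);
        return location;
    }

    int lookupCount() {
        return lookups;
    }
}
```

The sketch deliberately leaves the returned name untouched rather than stripping the trailing dot, since the issue description asks for the same format the task tracker gets from DNS. Whether caching the fallback is acceptable depends on how long the cache lives; a long-lived cache might prefer to retry after some time.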