hbase-user mailing list archives

From Tim Robertson <timrobertson...@gmail.com>
Subject Re: MR not seeing data locality - IP versus Host name
Date Mon, 28 May 2012 15:02:23 GMT
Thanks Stack... you nailed it.

I was launching this from my laptop over VPN, and for some reason the
reverse DNS did not work.  For anyone who stumbles upon this thread, the
following demonstrates the issue:

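    // i is built from the raw IP, so getHostName() needs a reverse DNS lookup
    InetSocketAddress i = new InetSocketAddress("130.226.238.181", 8080);
    // i2 is built from the name, so the host name is already known
    InetSocketAddress i2 = new InetSocketAddress("c4n1.gbif.org", 8080);
    System.out.println(i.getAddress().getHostName());
    System.out.println(i2.getAddress().getHostName());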

On the cluster machines:
  c4n1.gbif.org
  c4n1.gbif.org

On my laptop:
  130.226.238.181
  c4n1.gbif.org
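
A slightly more general version of the same check, as a minimal sketch only: it takes IPs on the command line and flags any for which the local resolver returns the IP text instead of a name. The class name ReverseDnsCheck and the helper fellBackToIp are made up for illustration and are not part of HBase or Hadoop; only the java.net.InetAddress calls are standard.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of HBase: reports whether reverse DNS
// works for the given IPs on the machine where it is run.
final class ReverseDnsCheck {

    // getHostName() returns the textual IP when the reverse lookup fails,
    // so an identical name and address means no name was found.
    static boolean fellBackToIp(String hostName, String hostAddress) {
        return hostName.equals(hostAddress);
    }

    public static void main(String[] args) throws UnknownHostException {
        if (args.length == 0) {
            System.err.println("usage: java ReverseDnsCheck <ip> [<ip> ...]");
            return;
        }
        for (String ip : args) {
            // For an IP literal, getByName() only parses the address;
            // the reverse lookup happens inside getHostName().
            InetAddress addr = InetAddress.getByName(ip);
            String name = addr.getHostName();
            String status = fellBackToIp(name, addr.getHostAddress())
                    ? "no reverse DNS (split location would be the IP)"
                    : "ok";
            System.out.println(ip + " -> " + name + " : " + status);
        }
    }
}
```

Run it on the machine that submits the job, passing the region server IPs; on my laptop over VPN I would expect 130.226.238.181 to be flagged, and on the cluster machines I would not.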

Cheers,
Tim




On Mon, May 28, 2012 at 3:54 PM, Tim Robertson <timrobertson100@gmail.com> wrote:

> Thanks Stack.  We're looking into this a lot.
>
> As far as we can tell DNS is correct, machine host names are correct etc.
> In .META. it uses fully qualified names (c4n5.gbif.org) so I guess I'll
> start looking at the job launching machine.
>
> The code you link to is quite different to the TableInputFormatBase in
> CDH3u3.  I actually patched that with the following to verify to myself it
> would work, and it did indeed work (got a blog about the performance which
> you'll like):
>
>       // patch the possible GBIF DNS issue - TT report differing things to split locations
>       // Task attempts show as /default-rack/c4n2.gbif.org
>       // splits are coming in as /default-rack/130.226.238.182
>       regionLocation = regionLocation.replaceAll("130.226.238.181", "c4n1.gbif.org");
>       regionLocation = regionLocation.replaceAll("130.226.238.182", "c4n2.gbif.org");
>       regionLocation = regionLocation.replaceAll("130.226.238.183", "c4n3.gbif.org");
>       regionLocation = regionLocation.replaceAll("130.226.238.184", "c4n4.gbif.org");
>       regionLocation = regionLocation.replaceAll("130.226.238.185", "c4n5.gbif.org");
>       regionLocation = regionLocation.replaceAll("130.226.238.186", "c4n6.gbif.org");
>
> More when we know more.
> Tim
>
>
> On Mon, May 28, 2012 at 12:32 AM, Stack <stack@duboce.net> wrote:
>
>> On Sun, May 27, 2012 at 1:05 PM, Tim Robertson
>> <timrobertson100@gmail.com> wrote:
>> > Hi all,
>> >
>> > When I run MR jobs, I don't see data locality because the TT sees
>> > /default-rack/c4n1.gbif.org but the TableInputFormat is
>> > giving /default-rack/130.226.238.181 (the same machine) when it
>> > determines the splits for the job.
>>
>> It's doing this, Tim:
>>
>>
>> http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.html#145
>>
>> On the machine launching the job, it's asking what the region location
>> is.  What is in the .META. table?  Names or IPs?  If the former, then it's
>> the resolve on the machine launching the job that is mangling it (DNS
>> falls back to the IP if there is a problem figuring out the name).  Can you
>> mess w/ the DNS on the machine that is launching the job?  See if you can
>> find an issue in its DNS.  (Is this 0.90.X?  If so, do its forward and
>> reverse DNS give the same answer?  If 0.92.1, it shouldn't matter.)
>>
>> St.Ack
>>
>
>
