hadoop-general mailing list archives

From Michael Thomas <tho...@hep.caltech.edu>
Subject Re: rack awareness help
Date Fri, 19 Mar 2010 01:21:11 GMT
On 03/17/2010 08:34 PM, Mag Gam wrote:
> Well,  I didn't really solve the problem. Now I have even more questions.
>
> I came across this script,
> http://wiki.apache.org/hadoop/topology_rack_awareness_scripts
>
> but it makes no sense to me! Can someone please try to explain what
> it's trying to do?
>
>
> MikeThomas:
>
> Your script isn't working for me. I think there are some syntax
> errors. Is this how it's supposed to look: http://pastebin.ca/1844287

Not quite.  A couple of lines got incorrectly wrapped.  It should look 
like this:

http://pastebin.ca/1845286

--Mike

> On Thu, Mar 4, 2010 at 10:30 PM, Jeff Hammerbacher <hammer@cloudera.com> wrote:
>> Hey Mag,
>>
>> Glad you have solved the problem. I've created a JIRA ticket to improve the
>> existing documentation: https://issues.apache.org/jira/browse/HADOOP-6616.
>> If you have some time, it would be useful to hear what could be added to the
>> existing documentation that would have helped you figure this out sooner.
>>
>> Thanks,
>> Jeff
>>
>> On Thu, Mar 4, 2010 at 3:39 PM, Mag Gam <magawake@gmail.com> wrote:
>>
>>> Thanks everyone for explaining this to me instead of giving me RTFM!
>>>
>>> I will play around with it and see how far I get.
>>>
>>>
>>>
>>> On Thu, Mar 4, 2010 at 9:21 AM, Steve Loughran <stevel@apache.org> wrote:
>>>> Allen Wittenauer wrote:
>>>>>
>>>>> On 3/3/10 5:01 PM, "Mag Gam" <magawake@gmail.com> wrote:
>>>>>
>>>>>> Thanks Allen! Your presentation is very nice!
>>>>>
>>>>> Thanks. :)
>>>>>
>>>>>> "If you don't provide a script for rack awareness, it treats every
>>>>>> node as if it was its own rack". I am using the default settings and
>>>>>> the report still says only 1 rack.
>>>>>
>>>>> Let's take a different approach to convince you. :)
>>>>>
>>>>> Think about the question:  Is there a difference between all nodes in
>>>>> one rack vs. every node acting as a lone rack?
>>>>>
>>>>> The answer is no, there isn't any difference.  In both cases, all copies
>>>>> of the blocks can go to pretty much any node. When a MR job runs, every
>>>>> node would either be considered 'off rack' or 'rack-local'.
>>>>>
>>>>> So there is no difference.
>>>>>
>>>>>
>>>>>> Do you mind sharing a script with us on how you determine a rack? and
>>>>>> a sample <configuration> </configuration> syntax?
>>>>>
>>>>> Michael has already posted his, so I'll skip this one. :)
>>>>>
>>>>
>>>> Think Mag probably wanted a shell script.
>>>>
>>>> Mag, give your machines IPv4 addresses that map to rack number. 10.1.1.*
>>>> for rack one, 10.1.2.* for rack 2, etc. Then just filter out the IP address
>>>> by the top bytes, returning "10.1.1" for everything in rack one, "10.1.2"
>>>> for rack 2; Hadoop will be happy
>>>>
>>>
>>
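
The scheme Steve describes in the quoted text above (one rack per 10.1.N.* subnet, identified by the top three bytes of the address) could be sketched roughly as follows. This is only an illustration in Python, not the script from the pastebin link and not a shell script. The names rack_for and DEFAULT_RACK are hypothetical. The leading "/" on the rack name is also an assumption, on the basis that Hadoop network locations are conventionally written as paths such as /default-rack; Steve's message returns the bare prefix.

```python
#!/usr/bin/env python3
"""Hypothetical topology script: map each IPv4 address to a rack name."""
import sys

# Assumed fallback for arguments that are not dotted-quad IPv4 addresses
# (for example, when a hostname is passed instead of an address).
DEFAULT_RACK = "/default-rack"


def rack_for(address):
    """Return a rack path built from the first three octets of an IPv4 address."""
    octets = address.strip().split(".")
    is_ipv4 = len(octets) == 4 and all(
        o.isdecimal() and int(o) <= 255 for o in octets
    )
    if not is_ipv4:
        return DEFAULT_RACK
    # 10.1.1.17 -> /10.1.1, 10.1.2.5 -> /10.1.2
    return "/" + ".".join(octets[:3])


def main(argv):
    # Several addresses may arrive in a single invocation;
    # print one rack name per argument, in the same order.
    for arg in argv:
        print(rack_for(arg))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved as an executable file (say rack_map.py, a made-up name) and run as `./rack_map.py 10.1.1.17 10.1.2.5`, it prints /10.1.1 and /10.1.2 on separate lines. In Hadoop releases of this era the path to such a script is normally given in the topology.script.file.name property inside the <configuration> element of the site configuration; check the documentation for your version before relying on that name.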


