hadoop-general mailing list archives

From Mag Gam <magaw...@gmail.com>
Subject Re: rack awareness help
Date Fri, 19 Mar 2010 00:41:55 GMT
Chris:

This clears up my questions a lot! Thank you.

So, if I have 4 data servers and I want 2 racks, I can do this:

#!/bin/bash
#rack1.sh
echo rack1

#!/bin/bash
#rack2.sh
echo rack2


So, I can do this for 2 servers:


<property>
 <name>topology.script.file.name</name>
 <value>rack1.sh</value>
</property>

And for the other 2 servers, I can do this:


<property>
 <name>topology.script.file.name</name>
 <value>rack2.sh</value>
</property>


correct?
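
For comparison, here is a minimal sketch of the single-script approach Chris describes in the quoted message below: one script, configured once, that receives node names or IPs as arguments and prints one rack per argument. The script name (topology.sh), the table path (/etc/hadoop/topology.data), the TOPOLOGY_FILE variable, and the lookup_rack function are all hypothetical names chosen for illustration. The table format of "node rack" per line is an assumption based on Chris's description.

```shell
#!/bin/bash
# topology.sh (hypothetical name): one script for the whole cluster.
# Hadoop passes one or more node names/IPs as arguments and reads
# one rack name per argument from standard output.

# Hypothetical location of the lookup table; each line is "<node> <rack>".
TOPOLOGY_FILE="${TOPOLOGY_FILE:-/etc/hadoop/topology.data}"

# Print the rack for a single node, falling back to Hadoop's default
# rack name when the node is not listed or the table is unreadable.
lookup_rack() {
  local node="$1" rack=""
  if [ -r "$TOPOLOGY_FILE" ]; then
    # Match the node in the left column, print the right column.
    rack=$(awk -v n="$node" '$1 == n { print $2; exit }' "$TOPOLOGY_FILE")
  fi
  echo "${rack:-/default-rack}"
}

# Answer for every node given on the command line, in order.
for node in "$@"; do
  lookup_rack "$node"
done
```

With a table line such as "10.1.1.11 /rack1", running ./topology.sh 10.1.1.11 would print /rack1. Under this approach topology.script.file.name would point at this one script rather than at a different script per server.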


On Thu, Mar 18, 2010 at 3:15 AM, Christopher Tubbs <ctubbsii@gmail.com> wrote:
> Hadoop will identify data nodes in your cluster by name and execute
> your script with the data node as an argument. The expected output of
> your script is the name of the rack on which it is located.
>
> The script you referenced takes the node name as an argument ($1), and
> crawls through a separate file looking up that node in the left
> column, and printing the value in the second column if it finds it.
>
> If you were to use this script, you would just create the topology
> file that lists all your nodes by name/ip on the left and the rack
> they are in on the right.
>
> On Wed, Mar 17, 2010 at 11:34 PM, Mag Gam <magawake@gmail.com> wrote:
>> Well, I didn't really solve the problem. Now I have even more questions.
>>
>> I came across this script,
>> http://wiki.apache.org/hadoop/topology_rack_awareness_scripts
>>
>> but it makes no sense to me! Can someone please try to explain what
>> it's trying to do?
>>
>>
>> MikeThomas:
>>
>> Your script isn't working for me. I think there are some syntax
>> errors. Is this how it's supposed to look: http://pastebin.ca/1844287
>>
>> thanks
>>
>>
>>
>> On Thu, Mar 4, 2010 at 10:30 PM, Jeff Hammerbacher <hammer@cloudera.com> wrote:
>>> Hey Mag,
>>>
>>> Glad you have solved the problem. I've created a JIRA ticket to improve the
>>> existing documentation: https://issues.apache.org/jira/browse/HADOOP-6616.
>>> If you have some time, it would be useful to hear what could be added to the
>>> existing documentation that would have helped you figure this out sooner.
>>>
>>> Thanks,
>>> Jeff
>>>
>>> On Thu, Mar 4, 2010 at 3:39 PM, Mag Gam <magawake@gmail.com> wrote:
>>>
>>>> Thanks everyone for explaining this to me instead of giving me RTFM!
>>>>
>>>> I will play around with it and see how far I get.
>>>>
>>>>
>>>>
>>>> On Thu, Mar 4, 2010 at 9:21 AM, Steve Loughran <stevel@apache.org> wrote:
>>>> > Allen Wittenauer wrote:
>>>> >>
>>>> >> On 3/3/10 5:01 PM, "Mag Gam" <magawake@gmail.com> wrote:
>>>> >>
>>>> >>> Thanks Alan! Your presentation is very nice!
>>>> >>
>>>> >> Thanks. :)
>>>> >>
>>>> >>> "If you don't provide a script for rack awareness, it treats every
>>>> >>> node as if it was its own rack". I am using the default settings and
>>>> >>> the report still says only 1 rack.
>>>> >>
>>>> >> Let's take a different approach to convince you. :)
>>>> >>
>>>> >> Think about the question: Is there a difference between all nodes in one
>>>> >> rack vs. every node acting as a lone rack?
>>>> >>
>>>> >> The answer is no, there isn't any difference. In both cases, all copies of
>>>> >> the blocks can go to pretty much any node. When a MR job runs, every node
>>>> >> would either be considered 'off rack' or 'rack-local'.
>>>> >>
>>>> >> So there is no difference.
>>>> >>
>>>> >>
>>>> >>> Do you mind sharing a script with us on how you determine a rack? and
>>>> >>> a sample <configuration> </configuration> syntax?
>>>> >>
>>>> >> Michael has already posted his, so I'll skip this one. :)
>>>> >>
>>>> >
>>>> > Think Mag probably wanted a shell script.
>>>> >
>>>> > Mag, give your machines IPv4 addresses that map to rack number. 10.1.1.* for
>>>> > rack one, 10.1.2.* for rack 2, etc. Then just filter out the IP address by
>>>> > the top bytes, returning "10.1.1" for everything in rack one, "10.1.2" for
>>>> > rack 2; Hadoop will be happy
>>>> >
>>>>
>>>
>>
>
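
Steve's subnet suggestion, quoted above, could be sketched roughly as follows. This is only an illustration under stated assumptions: the script name (rack-by-subnet.sh) and the rack_for_ip function are hypothetical, the arguments are assumed to be dotted IPv4 addresses, and the leading slash on the rack name is an addition here to match the path-like form of Hadoop's /default-rack rather than the bare "10.1.1" Steve mentions.

```shell
#!/bin/bash
# rack-by-subnet.sh (hypothetical name): derive the rack from the
# first three bytes of each IPv4 address passed as an argument.

# Print "/a.b.c" for an address a.b.c.d; anything that does not look
# like a dotted IPv4 address falls back to /default-rack.
rack_for_ip() {
  local ip="$1"
  case "$ip" in
    [0-9]*.[0-9]*.[0-9]*.[0-9]*)
      # Strip the last ".d" component, keeping the top three bytes.
      echo "/${ip%.*}"
      ;;
    *)
      echo "/default-rack"
      ;;
  esac
}

# Emit one rack per argument, in the order given.
for node in "$@"; do
  rack_for_ip "$node"
done
```

The pattern check is deliberately loose (it only looks for four dot-separated parts that each start with a digit), so a stricter validation may be worth adding if hostnames can reach the script.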
