hadoop-general mailing list archives

From Andrey Kuzmin <andrey.v.kuz...@gmail.com>
Subject Re: Running Hadoop across data centers
Date Wed, 13 Jan 2010 19:44:08 GMT

On Wed, Jan 13, 2010 at 10:31 PM, Eric Sammer <eric@lifeless.net> wrote:
> On 1/13/10 2:09 PM, Andrey Kuzmin wrote:
>> On Tue, Jan 12, 2010 at 7:03 PM, Eric Sammer <eric@lifeless.net> wrote:
>>> On 1/12/10 6:01 AM, Antonio Goncalves wrote:
>>>> Thanks Eric and Phil for your inputs.
>>>>
>>>> We have 80% of our calculation that can be done in one data center, but the
>>>> rest is heavy computation. We are using some time-consuming algorithms (such
>>>> as Monte Carlo, for example) that would take too much time in one data center.
>>>> For this kind of computation we are thinking of using the second data center,
>>>> based in Germany. We haven't done all the study of the data, but I guess
>>>> that for the 80% the data will be local to one data center, and for the 20%
>>>> it would have to be distributed across data centers. What we haven't worked on
>>>> yet is the size of this distributed data. It looks like it would not be that big
>>>> (maybe less than 1 TB, but it could grow when doing some calculations based
>>>> on archived data).
>>>
>>> Antonio:
>>>
>>> The point here is that if you build one logical Hadoop cluster across
>>> two data centers, then Hadoop will consider all nodes as candidates for
>>> receiving work regardless of which data center the job is started in.
>>
>> Is Hadoop's job scheduler totally NUMA-unaware? The question applies to
>> both the cross-data center scenario being discussed and the single data
>> center case: just imagine the usual within-rack or between-racks
>> scheduling decision.
>
> Andrey:
>
> Take a look at how replicas are assigned to data nodes[1] to see how
> blocks are distributed. During M/R, the job tracker will assign a map
> task to a tracker that is "as close to the input split's data as
> possible." Close means either data local (the task tracker is running
> on the same machine as the data), rack local (the task tracker is on a
> machine in the same rack as the data), or not local at all. What I was
> saying is that the third case is always possible (and undesirable).
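The three-level preference Eric describes can be sketched roughly as below. This is a simplified model for illustration, not Hadoop's actual JobTracker code; the `Tracker` type and host/rack names are made up for the example.

```java
import java.util.List;

public class LocalitySketch {
    // A candidate task tracker, identified by host and rack (hypothetical model).
    record Tracker(String host, String rack) {}

    // Ordinal order encodes preference: data local beats rack local beats off-rack.
    enum Locality { DATA_LOCAL, RACK_LOCAL, OFF_RACK }

    // Classify how close a tracker is to the host/rack holding the input block.
    static Locality classify(Tracker t, String blockHost, String blockRack) {
        if (t.host().equals(blockHost)) return Locality.DATA_LOCAL;
        if (t.rack().equals(blockRack)) return Locality.RACK_LOCAL;
        return Locality.OFF_RACK;
    }

    // Pick the closest candidate; an off-rack (e.g. other data center)
    // tracker is still chosen if nothing better is available.
    static Tracker pick(List<Tracker> candidates, String blockHost, String blockRack) {
        Tracker best = null;
        Locality bestLoc = null;
        for (Tracker t : candidates) {
            Locality loc = classify(t, blockHost, blockRack);
            if (best == null || loc.ordinal() < bestLoc.ordinal()) {
                best = t;
                bestLoc = loc;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Tracker> candidates = List.of(
            new Tracker("dc2-node1", "dc2-rack1"),   // other data center: off-rack
            new Tracker("dc1-node2", "dc1-rack1"));  // same rack as the block
        // The block lives on dc1-node1 in dc1-rack1.
        Tracker chosen = pick(candidates, "dc1-node1", "dc1-rack1");
        System.out.println(chosen.host());  // prints "dc1-node2": rack local wins
    }
}
```

In a single logical cluster spanning two data centers, every node in the far data center falls into the OFF_RACK bucket, so it remains a valid (if last-resort) candidate, which is exactly the case Eric calls undesirable.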

Right, but this also means that, with a NUMA-aware scheduler, a job
distributed between data centers will run as well as it can with
respect to locality. Hence, theoretically, there could be computation
models where cross-data center jobs are feasible (disregarding the
single name-node issue) rather than "undesirable".
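The Monte Carlo work Antonio mentioned is one such model: each task needs
only a seed and a sample count as input, so almost nothing has to cross the
inter-data-center link. A minimal sketch of that compute-heavy, data-light
shape (plain Java, not an actual M/R job; class and method names are made
up for the example):

```java
import java.util.Random;

public class MonteCarloPi {
    // One map-style task: count random points falling inside the unit
    // quarter-circle. The entire "input split" is (seed, samples).
    static long countHits(long seed, int samples) {
        Random rng = new Random(seed);
        long hits = 0;
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble(), y = rng.nextDouble();
            if (x * x + y * y <= 1.0) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int tasks = 8, samplesPerTask = 200_000;
        long totalHits = 0;
        // Reduce step: just sum the per-task counts (one long per task).
        for (int t = 0; t < tasks; t++) {
            totalHits += countHits(t, samplesPerTask);
        }
        double pi = 4.0 * totalHits / ((double) tasks * samplesPerTask);
        System.out.println(pi);  // close to 3.14159
    }
}
```

For a job like this, data locality barely matters, which is why it could
run across data centers without the penalty Eric warns about.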

Regards,
Andrey

>
> [1]
> http://hadoop.apache.org/common/docs/current/hdfs_design.html#Replica+Placement%3A+The+First+Baby+Steps
>
> --
> Eric Sammer
> eric@lifeless.net
> http://esammer.blogspot.com
>
