hadoop-yarn-dev mailing list archives

From "Костарев А.Ф." <...@ics.perm.ru>
Subject Re: Algorithm of distribution Map and Reduce tasks at various topology of a network
Date Tue, 09 Jul 2013 11:36:38 GMT
Hi Junping,

Thank you for your prompt response.

We will try to repeat the test on Thursday and report more details.


On 07/09/2013 05:18 PM, Jun Ping Du wrote:
> Hi Костарев,
>    I think it should work on YARN even though YARN does not yet support a layer above
> rack (I am actually working on multi-layer topology support for YARN in YARN-18).
>    Current YARN should simply recognize your topology as three racks: "dc1/rack1",
> "dc2/rack1", "dc2/rack2". Each node (NM) with free resources should be assigned
> containers during its heartbeat with the RM, regardless of locality level. The only
> exceptions should be: 1. there are no pending resource requests; 2. the NM capacity is
> too small to meet the resource request; 3. delay scheduling is enabled and there is no
> data-local attempt. In your case, I don't see anything that would stop task assignment
> on a1 and a2. Anyone here can correct me if I have misunderstood something. :)
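
The three exception cases listed above can be sketched as a small toy model. This is only an illustration under stated assumptions, not YARN's scheduler code: the names `Node`, `Request`, and `may_assign` are hypothetical, memory is the only resource considered, and delay scheduling is reduced to "skip a node that holds no local data", whereas a real scheduler would relax that after a number of missed opportunities.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str            # hypothetical: NodeManager host name
    free_memory_mb: int  # hypothetical: free capacity reported in the heartbeat


@dataclass
class Request:
    memory_mb: int               # size of the requested container
    data_local_nodes: frozenset  # hosts that hold a replica of the input data


def may_assign(node, pending, delay_scheduling=False):
    """Return the first pending request this node could take, or None."""
    for req in pending:  # case 1: an empty list means nothing to assign
        if req.memory_mb > node.free_memory_mb:
            continue     # case 2: node capacity is too small for this request
        if delay_scheduling and node.name not in req.data_local_nodes:
            continue     # case 3: keep waiting for a data-local node
        return req
    return None


if __name__ == "__main__":
    # With replication factor 5, every server holds a replica of every block.
    everywhere = frozenset({"a1", "a2", "b1", "b2", "b3"})
    request = Request(memory_mb=1024, data_local_nodes=everywhere)
    a1 = Node(name="a1", free_memory_mb=2048)
    print(may_assign(a1, [request], delay_scheduling=True) is not None)
```

In this simplified model, a1 is eligible even with delay scheduling turned on, because the input is replicated on every server.
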
>    Anyway, I will try it later with your configuration to see whether there are bugs in
> boundary cases or whether it could be a misconfiguration. Which minor version (2.0.x or
> trunk) are you using now?
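
To help rule out misconfiguration, here is a minimal sketch of a topology script that produces the three-rack view described above. It relies on several assumptions: the script is the one referenced by `net.topology.script.file.name`, Hadoop passes it the same identifiers used as keys here (in practice these are often IP addresses, not host names), rack paths carry a leading slash, and b2 sits in dc2/rack1 (the diagram in the original message lists b3 twice, so b2's placement is a guess).

```python
#!/usr/bin/env python3
"""Toy topology script: prints one rack path per host given on the command line."""
import sys

DEFAULT_RACK = "/default-rack"  # fallback for hosts not listed below

# Hypothetical mapping for the five servers in this thread.
RACK_OF = {
    "a1": "/dc1/rack1",
    "a2": "/dc1/rack1",
    "b1": "/dc2/rack1",
    "b2": "/dc2/rack1",  # assumption: b2's rack is not stated clearly
    "b3": "/dc2/rack2",
}


def resolve(hosts):
    """Map each host to its rack path, in the same order as the input."""
    return [RACK_OF.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop calls the script with one or more hosts as arguments and
    # reads one rack path per argument from standard output.
    print("\n".join(resolve(sys.argv[1:])))
```

Running the script by hand with each server's name or address is a quick way to check that a1 and a2 resolve to a rack path and do not fall back to the default rack.
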
>
> Thanks,
>
> Junping
>
> ----- Original Message -----
> From: "Костарев А.Ф." <kaf@ics.perm.ru>
> To: yarn-dev@hadoop.apache.org
> Sent: Tuesday, July 9, 2013 5:48:49 PM
> Subject: Algorithm of distribution Map and Reduce tasks at various topology of a network
>
> Hi
> I have a cluster in two datacenters:
>
>             CLUSTER
>                |
>       +--------+---------+
>       |                  |
> datacenter1        datacenter2
>       |                  |
>     rack1               rack1
>         |                |  |
>         +-a1             |  +-b1
>         |                |  |
>         +-a2             |  +-b2
>                          |
>                         rack2
>                             +-b3
>
>
> The cluster has a file with replication factor 5.
> The file's blocks reside on all servers of the cluster.
>
> When I work with standard MapReduce (MRv1) (called on b1), Map and
> Reduce tasks run on all servers: b1, b2, b3, a1, a2.
> When I work with YARN (MRv2) (called on b1), Map and Reduce tasks run
> only on b1, b2, b3.
>
> Can I run Map tasks on all servers in YARN?
>
>


-- 
Consultant, 1st category
Костарев А.Ф.

