hadoop-yarn-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Allocating Containers on a particular Node in Yarn
Date Tue, 19 Nov 2013 17:29:34 GMT
I think https://issues.apache.org/jira/browse/YARN-1412 was opened for
this. I am afraid without the RM debug logs it will be hard to diagnose
what is being observed.

Bikas

-----Original Message-----
From: Arun C Murthy [mailto:acm@hortonworks.com]
Sent: Tuesday, November 19, 2013 7:30 AM
To: yarn-dev@hadoop.apache.org
Subject: Re: Allocating Containers on a particular Node in Yarn

Sorry, I'm a little lost here.

Can you please summarize the issue you are seeing? I can try to help. Thanks.

On Nov 14, 2013, at 7:55 PM, Gaurav Gupta <gaurav@datatorrent.com> wrote:

> Even after setting node-locality-delay to 50, it is not working
>
> -----Original Message-----
> From: Gaurav Gupta [mailto:gaurav@datatorrent.com]
> Sent: Thursday, November 14, 2013 7:03 PM
> To: yarn-dev@hadoop.apache.org
> Subject: RE: Allocating Containers on a particular Node in Yarn
>
> There are some other small applications running, but resources are
> available on every node of the cluster, so resources should not be a
> problem.
>
> Following is the node-locality-delay setting:
>
>   <property>
>     <name>yarn.scheduler.capacity.node-locality-delay</name>
>     <value>1</value>
>     <description>
>       Number of missed scheduling opportunities after which the
>       CapacityScheduler attempts to schedule rack-local containers.
>       Typically this should be set to number of racks in the cluster,
>       this feature is disabled by default, set to -1.
>     </description>
>   </property>
>
> I am attaching logs from the Application Master, which show the
> requests being made and the resources I am getting back.
>
> 2013-11-14 18:48:09,016 main INFO  util.RackResolver
> (RackResolver.java:coreResolve(109)) - Resolved node10.morado.com to
> /default-rack
> 2013-11-14 18:48:09,017 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(540)) - Added priority=0
> 2013-11-14 18:48:09,023 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=0 resourceName=node10.morado.com
> numContainers=1
> #asks=1
> 2013-11-14 18:48:09,023 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=0 resourceName=/default-rack numContainers=1
> #asks=2
> 2013-11-14 18:48:09,023 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=0 resourceName=* numContainers=1 #asks=3
> 2013-11-14 18:48:09,024 main INFO  stram.StramAppMaster
> (StramAppMaster.java:sendContainerAskToRM(925)) - Requested container:
> Capability[<memory:8192, vCores:0>]Priority[1]
> 2013-11-14 18:48:09,024 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(540)) - Added priority=1
> 2013-11-14 18:48:09,024 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=1 resourceName=* numContainers=1 #asks=4
> 2013-11-14 18:48:09,024 main INFO  stram.StramAppMaster
> (StramAppMaster.java:sendContainerAskToRM(925)) - Requested container:
> Capability[<memory:8192, vCores:0>]Priority[2]
> 2013-11-14 18:48:09,024 main INFO  util.RackResolver
> (RackResolver.java:coreResolve(109)) - Resolved node18.morado.com to
> /default-rack
> 2013-11-14 18:48:09,024 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(540)) - Added priority=2
> 2013-11-14 18:48:09,024 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=2 resourceName=node18.morado.com
> numContainers=1
> #asks=5
> 2013-11-14 18:48:09,025 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=2 resourceName=/default-rack numContainers=1
> #asks=6
> 2013-11-14 18:48:09,025 main DEBUG impl.AMRMClientImpl
> (AMRMClientImpl.java:addResourceRequest(570)) - addResourceRequest:
> applicationId= priority=2 resourceName=* numContainers=1 #asks=7
> 2013-11-14 18:48:10,063 main INFO  stram.StramAppMaster
> (StramAppMaster.java:execute(764)) - Got new container.,
> containerId=container_1384399307129_0027_01_000002,
> containerNode=node8.morado.com:51530,
> containerNodeURI=node8.morado.com:8042, containerResourceMemory8192,
> priority0
> 2013-11-14 18:48:10,218 main INFO  stram.StramAppMaster
> (StramAppMaster.java:execute(764)) - Got new container.,
> containerId=container_1384399307129_0027_01_000003,
> containerNode=node8.morado.com:51530,
> containerNodeURI=node8.morado.com:8042, containerResourceMemory8192,
> priority1
> 2013-11-14 18:48:10,235 main INFO  stram.StramAppMaster
> (StramAppMaster.java:execute(764)) - Got new container.,
> containerId=container_1384399307129_0027_01_000004,
> containerNode=node37.morado.com:50631,
> containerNodeURI=node37.morado.com:8042, containerResourceMemory8192,
> priority2
>
> -Gaurav
>
> -----Original Message-----
> From: Bikas Saha [mailto:bikas@hortonworks.com]
> Sent: Thursday, November 14, 2013 6:37 PM
> To: yarn-dev@hadoop.apache.org
> Subject: RE: Allocating Containers on a particular Node in Yarn
>
> What else is running on the cluster? What is the locality delay value
> set to? This value is not a time. It is the number of node heartbeats
> to wait before assigning a rack-local container. So if that many nodes
> have heartbeated to the RM before the RM could assign a node-local
> machine to that request, then it will assign a rack-local machine.
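To make the counting concrete, here is a minimal Java sketch of the behavior described above. It is a simplified model and not the CapacityScheduler source: the class LocalityDelayModel, its method onHeartbeat, and the threshold comparison are assumptions made for illustration, and every node is assumed to sit in the same rack as the requested host.

```java
import java.util.Optional;

/**
 * Simplified model of node-locality-delay (hypothetical, not YARN code).
 * Every heartbeat from a node other than the requested host counts as one
 * missed scheduling opportunity for the pending request.
 */
final class LocalityDelayModel {
    private final String requestedHost;
    private final int nodeLocalityDelay;
    private int missedOpportunities = 0;

    LocalityDelayModel(String requestedHost, int nodeLocalityDelay) {
        this.requestedHost = requestedHost;
        this.nodeLocalityDelay = nodeLocalityDelay;
    }

    /**
     * Called once per node heartbeat. Returns the host that receives the
     * container, or empty if the request keeps waiting.
     */
    Optional<String> onHeartbeat(String heartbeatingHost) {
        if (heartbeatingHost.equals(requestedHost)) {
            return Optional.of(heartbeatingHost); // node-local assignment
        }
        missedOpportunities++;
        // Assumption: fall back once the count exceeds the configured delay;
        // the exact comparison in the real scheduler may differ.
        if (missedOpportunities > nodeLocalityDelay) {
            return Optional.of(heartbeatingHost); // rack-local fallback
        }
        return Optional.empty(); // keep waiting for the requested host
    }
}
```

In this model, a delay of 1 means the request waits through one heartbeat from another node and is handed to the next one, so on a busy cluster the requested host rarely gets a chance to heartbeat first.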
>
> It is interesting that even if you don't specify the rack, i.e. you
> want the exact machine, you are still not getting the exact machine.
> You should either get the exact machine or your request will not be
> fulfilled. You should never get a different machine. If this is what
> you observe, then please open a bug on Jira and attach the RM logs,
> mentioning the machine name and container id that were erroneous. You
> will probably have to enable debug logs on the RM before you get the
> repro.
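The rule stated above (exact machine or nothing when only a host is requested and locality is not relaxed) can be checked mechanically before filing the bug. The following is a rough sketch under stated assumptions: StrictLocalityCheck, Allocation, and violations are hypothetical names, nothing here calls a YARN API, and the check is only meaningful for priorities where a single host was requested with relaxLocality=false and no rack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that lists containers placed on a host other than the one requested. */
final class StrictLocalityCheck {

    /** One allocated container as seen by the AM. */
    static final class Allocation {
        final String containerId;
        final String host;
        final int priority;

        Allocation(String containerId, String host, int priority) {
            this.containerId = containerId;
            this.host = host;
            this.priority = priority;
        }
    }

    /**
     * requestedHostByPriority maps a priority to the one host requested at that
     * priority. Priorities without an entry are ignored. Each returned string
     * names the container id, the host it landed on, and the requested host.
     */
    static List<String> violations(Map<Integer, String> requestedHostByPriority,
                                   List<Allocation> allocations) {
        List<String> result = new ArrayList<>();
        for (Allocation a : allocations) {
            String wanted = requestedHostByPriority.get(a.priority);
            if (wanted != null && !wanted.equals(a.host)) {
                result.add(a.containerId + " on " + a.host + " (requested " + wanted + ")");
            }
        }
        return result;
    }
}
```

The strings it returns carry the machine name and container id that the bug report is asked to mention.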
>
> Bikas
>
> -----Original Message-----
> From: Gaurav Gupta [mailto:gaurav@datatorrent.com]
> Sent: Thursday, November 14, 2013 5:48 PM
> To: yarn-dev@hadoop.apache.org
> Subject: RE: Allocating Containers on a particular Node in Yarn
>
> Hi Bikas,
>
> With scheduler delay on and relax locality set to true (with and
> without requesting the rack), I don't get the containers on the
> required host. They are always assigned to a different host.
> I am using the default Capacity Scheduler. Here is the code snippet:
>
>    // memory and priority are defined elsewhere in the application master
>    AMRMClient<ContainerRequest> amRmClient = AMRMClient.createAMRMClient();
>    String host = "h1";
>    Resource capability = Records.newRecord(Resource.class);
>    capability.setMemory(memory);
>    String[] nodes = new String[] {host};
>    // in order to request a host, we also have to request the rack
>    String[] racks = new String[] {"/default-rack"};
>    List<ContainerRequest> containerRequests =
>        new ArrayList<ContainerRequest>();
>    List<ContainerId> releasedContainers = new ArrayList<ContainerId>();
>    // the last constructor argument is relaxLocality
>    containerRequests.add(new ContainerRequest(capability, nodes, racks,
>        Priority.newInstance(priority), false));
>    if (containerRequests.size() > 0) {
>      LOG.info("Asking RM for containers: " + containerRequests);
>      for (ContainerRequest cr : containerRequests) {
>        LOG.info("Requested container: {}", cr.toString());
>        amRmClient.addContainerRequest(cr);
>      }
>    }
>
>    // hand back containers that are no longer needed
>    for (ContainerId containerId : releasedContainers) {
>      LOG.info("Released container, id={}", containerId.getId());
>      amRmClient.releaseAssignedContainer(containerId);
>    }
>    // send the pending asks and releases to the RM
>    return amRmClient.allocate(0);
>
>
>
> Thanks
> Gaurav
>
> -----Original Message-----
> From: Bikas Saha [mailto:bikas@hortonworks.com]
> Sent: Wednesday, November 13, 2013 7:05 PM
> To: yarn-dev@hadoop.apache.org
> Subject: RE: Allocating Containers on a particular Node in Yarn
>
> What you ask for, trying the requested node first and then falling
> back to others, is the default behavior of the current schedulers in
> YARN, i.e. relaxLocality is true by default.
>
> -----Original Message-----
> From: Thomas Weise [mailto:thomas.weise@gmail.com]
> Sent: Wednesday, November 13, 2013 3:55 PM
> To: yarn-dev@hadoop.apache.org
> Subject: Re: Allocating Containers on a particular Node in Yarn
>
> Is it possible to specify a particular node and have the RM fall back
> to a different node only after making an attempt to allocate on the
> requested node? In other words, is the combination of a specific host
> name and relaxLocality=TRUE meaningful at all?
>
> Thanks.
>
>
> On Wed, Nov 13, 2013 at 3:23 PM, Alejandro Abdelnur
> <tucu@cloudera.com> wrote:
>
>> Gaurav,
>>
>> Setting relaxLocality to FALSE should do it.
>>
>> thanks.
>>
>>
>> On Wed, Nov 13, 2013 at 2:58 PM, gaurav <gaurav@datatorrent.com> wrote:
>>
>>> Hi,
>>> I am trying to allocate containers on a particular node in YARN, but
>>> YARN is returning containers on a different node although the
>>> requested node has resources available.
>>>
>>> I checked the allocate(AllocateRequest request) function of
>>> ApplicationMasterService, and my request is as follows:
>>>
>>> request:
>>>   ask { priority { priority: 1 } resource_name: "h2"
>>>         capability { memory: 1000 } num_containers: 2 }
>>>   ask { priority { priority: 1 } resource_name: "/default-rack"
>>>         capability { memory: 1000 } num_containers: 2 }
>>>   ask { priority { priority: 1 } resource_name: "*"
>>>         capability { memory: 1000 } num_containers: 2 }
>>>   response_id: 1 progress: 0.0
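The three ask entries above follow one pattern: one entry for the host, one for its rack, and one for "*", all at the same priority and size. The Application Master log elsewhere on this page shows the same three resource names being added per request. A rough Java model of that expansion, with hypothetical names (AskExpansion, Ask, expand) and no dependency on the YARN client library:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of how one host-level request becomes three asks. */
final class AskExpansion {
    static final String ANY = "*";

    /** One entry of the ask list sent to the RM. */
    static final class Ask {
        final int priority;
        final String resourceName;
        final int memoryMb;
        final int numContainers;

        Ask(int priority, String resourceName, int memoryMb, int numContainers) {
            this.priority = priority;
            this.resourceName = resourceName;
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }

        @Override
        public String toString() {
            return "ask { priority: " + priority + " resource_name: \"" + resourceName
                    + "\" memory: " + memoryMb + " num_containers: " + numContainers + " }";
        }
    }

    /** Expands a request for a host into host, rack, and ANY asks, in that order. */
    static List<Ask> expand(int priority, String host, String rack,
                            int memoryMb, int numContainers) {
        List<Ask> asks = new ArrayList<>();
        asks.add(new Ask(priority, host, memoryMb, numContainers));
        asks.add(new Ask(priority, rack, memoryMb, numContainers));
        asks.add(new Ask(priority, ANY, memoryMb, numContainers));
        return asks;
    }
}
```

Calling AskExpansion.expand(1, "h2", "/default-rack", 1000, 2) yields the same three resource names as the request dump. Because the rack and "*" entries travel with the host entry, a scheduler with relaxed locality has other levels at which it may satisfy the request.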
>>>
>>> but the containers that I am getting back are as follows:
>>> [Container: [ContainerId: container_1384381084244_0001_01_000002,
>>>   NodeId: h1:1234, NodeHttpAddress: h1:2,
>>>   Resource: <memory:1024, vCores:1>, Priority: 1,
>>>   Token: Token { kind: ContainerToken, service: h1:1234 }, ],
>>> Container: [ContainerId: container_1384381084244_0001_01_000003,
>>>   NodeId: h1:1234, NodeHttpAddress: h1:2,
>>>   Resource: <memory:1024, vCores:1>, Priority: 1,
>>>   Token: Token { kind: ContainerToken, service: h1:1234 }, ]]
>>>
>>> I am attaching the test case that I have written to this mail. It
>>> uses classes under the org.apache.hadoop.yarn.server.resourcemanager
>>> package.
>>>
>>> Any pointers would be of great help.
>>>
>>> Thanks
>>> Gaurav
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Alejandro
>>
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or
> entity to which it is addressed and may contain information that is
> confidential, privileged and exempt from disclosure under applicable
> law. If the reader of this message is not the intended recipient, you
> are hereby notified that any printing, copying, dissemination,
> distribution, disclosure or forwarding of this communication is
> strictly prohibited. If you have received this communication in error,
> please contact the sender immediately and delete it from your system.
> Thank You.
>
>

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



