hadoop-mapreduce-user mailing list archives

From Pedro Magalhaes <pedror...@gmail.com>
Subject Re: yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores
Date Sun, 23 Aug 2015 20:34:10 GMT
Perfect, Varun! Now it is clear to me. Thanks again and again.

On Sun, Aug 23, 2015 at 5:29 PM, Varun Saxena <vsaxena.varun@gmail.com>
wrote:

> So how does hadoop get this property if it is per node? Does it get the
> minimum of all nodes?
>
> --> No, it is not the minimum of all nodes. Each NodeManager reads this
> configuration from its own configuration file (yarn-site.xml).
> The NodeManager is like an agent that manages the lifecycle of containers
> and is installed on each node where you want to run containers.
> It communicates with the ResourceManager, and that is how the
> ResourceManager comes to know the capability of each node. At the time of
> registration with the RM, the NodeManager reports that node's capability to
> the RM (for scheduling) by reading the above 2 configuration items (one for
> memory and one for vcores).
>
> By capability of a node I meant that you may have some nodes which have 8
> cores and some which have 16 cores, for instance. Some may have 16 GB of
> memory and some 24 GB.
> So the above 2 configurations can be set accordingly, because up to
> Hadoop 2.7 we were not getting a node's hardware capability from the
> operating system. This will be read automatically from the OS
> (Linux/Windows), if configured to do so, from 2.8 onwards.
>
> This is a NodeManager configuration and does not need to be configured
> on the client side when submitting the job.
>
> Regards,
> Varun Saxena
>
>
> On Mon, Aug 24, 2015 at 1:26 AM, Varun Saxena <vsaxena.varun@gmail.com>
> wrote:
>
>> This configuration is read and used by the NodeManager, on whichever node
>> it is running.
>> If it is not configured, the default value will be taken.
>>
>> Regards,
>> Varun Saxena.
>>
>> On Mon, Aug 24, 2015 at 1:21 AM, Pedro Magalhaes <pedrorjbr@gmail.com>
>> wrote:
>>
>>> Thanks Varun! Like we say in Brazil.  "U are the guy!" (Você é o cara!)
>>>
>>> I have another question. You said that:
>>> "yarn.nodemanager.resource.cpu-vcores on the other hand will have to be
>>> configured as per resource capability of that particular node. "
>>>
>>> I got the configuration from my job and printed it:
>>> yarn.nodemanager.resource.cpu-vcores 8
>>> yarn.nodemanager.resource.memory-mb 8192
>>>
>>> So how does hadoop get this property if it is per node? Does it get the
>>> minimum of all nodes? Thanks again!
>>>
>>>
>>>
>>> On Sun, Aug 23, 2015 at 4:40 PM, Varun Saxena <vsaxena.varun@gmail.com>
>>> wrote:
>>>
>>>> The fix will be released in the next version (2.8.0).
>>>> I checked the code to find the default value and then found it already
>>>> fixed in the documentation (configuration list).
>>>>
>>>> As this is an unreleased version, a URL (of the form
>>>> https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml)
>>>> may not be available, AFAIK.
>>>> However, this XML (yarn-default.xml) can be checked online in the git
>>>> repository.
>>>>
>>>> Associated JIRA which fixes this is
>>>> https://issues.apache.org/jira/browse/YARN-3823
>>>>
>>>> Regards,
>>>> Varun Saxena.
>>>>
>>>> On Mon, Aug 24, 2015 at 12:53 AM, Pedro Magalhaes <pedrorjbr@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Varun!
>>>>> Could you please send me the link with the fix?
>>>>>
>>>>> On Sun, Aug 23, 2015 at 2:20 PM, Varun Saxena <vsaxena.varun@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Pedro,
>>>>>>
>>>>>> The real default value of yarn.scheduler.maximum-allocation-vcores is 4.
>>>>>> The value of 32 is actually a documentation issue and has been fixed
>>>>>> recently.
>>>>>>
>>>>>> Regards,
>>>>>> Varun Saxena.
>>>>>>
>>>>>>
>>>>>> On Sun, Aug 23, 2015 at 10:39 PM, Pedro Magalhaes <
>>>>>> pedrorjbr@gmail.com> wrote:
>>>>>>
>>>>>>> Varun,
>>>>>>> Thanks for the reply. I understand the
>>>>>>> yarn.scheduler.maximum-allocation-vcores parameter. I am just asking
>>>>>>> why the defaults are yarn.scheduler.maximum-allocation-vcores=32 and
>>>>>>> yarn.nodemanager.resource.cpu-vcores=8.
>>>>>>>
>>>>>>> In my opinion, if yarn.scheduler.maximum-allocation-vcores is 32 by
>>>>>>> default, then yarn.nodemanager.resource.cpu-vcores should be equal to
>>>>>>> or greater than 32 by default.
>>>>>>> Does this make sense?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Aug 23, 2015 at 2:00 PM, Varun Saxena <
>>>>>>> vsaxena.varun@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Pedro,
>>>>>>>>
>>>>>>>> Actual allocation would depend on the total resource capability
>>>>>>>> advertised by the NM while registering with the RM.
>>>>>>>>
>>>>>>>> yarn.scheduler.maximum-allocation-vcores merely puts an upper cap on
>>>>>>>> the number of vcores which can be allocated by the RM, i.e. any
>>>>>>>> resource request/ask from an AM which asks for vcores > 32 (default
>>>>>>>> value) for a container will be normalized back to 32.
>>>>>>>>
>>>>>>>> If there is no such node available, this allocation will not be
>>>>>>>> fulfilled.
>>>>>>>>
>>>>>>>> yarn.scheduler.maximum-allocation-vcores is configured in the
>>>>>>>> resource manager and hence is common for a cluster, which can
>>>>>>>> possibly have multiple nodes with heterogeneous resource capabilities.
>>>>>>>>
>>>>>>>> yarn.nodemanager.resource.cpu-vcores, on the other hand, has to be
>>>>>>>> configured as per the resource capability of that particular node.
>>>>>>>>
>>>>>>>> Recently there has been work done to automatically get memory and
>>>>>>>> CPU information from the underlying OS (supported OSes being Linux
>>>>>>>> and Windows) if configured to do so. This change will be available
>>>>>>>> in 2.8.
>>>>>>>> I hope this answers your question.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Varun Saxena.
>>>>>>>>
>>>>>>>> On Sun, Aug 23, 2015 at 9:40 PM, Pedro Magalhaes <
>>>>>>>> pedrorjbr@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I was looking at default parameters for:
>>>>>>>>>
>>>>>>>>> yarn.nodemanager.resource.cpu-vcores = 8
>>>>>>>>> yarn.scheduler.maximum-allocation-vcores = 32
>>>>>>>>>
>>>>>>>>> For me, these two parameters' defaults don't make any sense.
>>>>>>>>>
>>>>>>>>> The first one says "the number of CPU cores that can be allocated
>>>>>>>>> for containers." (I imagine that is vcores.) The second says: "The
>>>>>>>>> maximum allocation for every container request at the RM". In my
>>>>>>>>> opinion, the second one must be equal to or less than the first one.
>>>>>>>>>
>>>>>>>>> How can I allocate 32 vcores for a container if I have only 8 cores
>>>>>>>>> available per container?
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
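
The interplay between the two settings discussed in this thread can be sketched as a small toy model. This is only an illustration of the behaviour as Varun describes it (each NodeManager reports its own vcores at registration, the ResourceManager caps each request at the cluster-wide maximum, and a request is not fulfilled if no node is large enough). The class and method names below are hypothetical and are not part of Hadoop, and the model ignores vcores already in use on a node.

```python
class ToyResourceManager:
    """Toy model of the RM side of this thread; NOT YARN's scheduler code."""

    def __init__(self, max_allocation_vcores):
        # Stands in for yarn.scheduler.maximum-allocation-vcores (cluster-wide).
        self.max_allocation_vcores = max_allocation_vcores
        # node id -> vcores that node advertised when it registered.
        self.node_vcores = {}

    def register_node(self, node_id, cpu_vcores):
        # Stands in for a NodeManager reporting its own
        # yarn.nodemanager.resource.cpu-vcores value at registration.
        self.node_vcores[node_id] = cpu_vcores

    def place(self, requested_vcores):
        # Cap the request at the cluster-wide maximum, as described above.
        capped = min(requested_vcores, self.max_allocation_vcores)
        # Pick the first node (by id) whose advertised capability is enough.
        for node_id, vcores in sorted(self.node_vcores.items()):
            if vcores >= capped:
                return node_id, capped
        # No node is large enough: the request stays unfulfilled.
        return None, capped


rm = ToyResourceManager(max_allocation_vcores=32)
rm.register_node("node-a", 8)    # a node configured with 8 vcores
rm.register_node("node-b", 16)   # a node configured with 16 vcores

print(rm.place(40))  # capped to 32, but no node advertises 32 vcores
print(rm.place(12))  # fits only on the 16-vcore node
```

With these assumed values, a cluster-wide maximum larger than every node's capability is not an error in itself: it only means that requests above the largest node's vcores can never be placed.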
