hadoop-common-user mailing list archives

From SACHINGUPTA <sac...@datametica.com>
Subject Re: number of mappers allowed in a container in hadoop2
Date Wed, 15 Oct 2014 14:18:49 GMT
Thanks for the reply.

I have one more doubt: are there three kinds of containers with
different memory sizes in Hadoop 2?

1. normal container
2. map task container
3. reduce task container

On Wednesday 15 October 2014 07:33 PM, Shahab Yunus wrote:
> The data that each map task will process is different from the 
> memory the task itself might require, depending on whatever 
> processing you plan to do in the task.
>
> A very trivial example: let us say your map gets 128 MB of input 
> data, but your task logic creates lots of String objects and 
> ArrayList objects. Wouldn't your memory requirement for the task 
> then be greater than your input data?
>
> I think you are confusing the size of the input data to the map/task 
> with the actual memory required by the map/task itself to do its work.
>
> Regards,
> Shahab
>
> On Wed, Oct 15, 2014 at 9:44 AM, SACHINGUPTA <sachin@datametica.com 
> <mailto:sachin@datametica.com>> wrote:
>
>     It is still not clear to me. Let's suppose the block size of my
>     HDFS is 128 MB, so every mapper will process only 128 MB of data.
>     Then what is the meaning of setting the property
>     mapreduce.map.memory.mb, when that is already known from the
>     block size? Why is this property needed?
>
>
>
>     On Wednesday 15 October 2014 07:06 PM, Shahab Yunus wrote:
>>     Explanation here.
>>
>>     http://stackoverflow.com/questions/24070557/what-is-the-relation-between-mapreduce-map-memory-mb-and-mapred-map-child-jav
>>     https://support.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
>>     http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
>>     (scroll towards the end.)
>>
>>     Regards,
>>     Shahab
>>
>>     On Wed, Oct 15, 2014 at 9:24 AM, SACHINGUPTA
>>     <sachin@datametica.com <mailto:sachin@datametica.com>> wrote:
>>
>>         I have one more doubt. I was reading this:
>>
>>         http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>>
>>         there are these properties:
>>
>>         mapreduce.map.memory.mb    = 2 * 1024 MB
>>         mapreduce.reduce.memory.mb = 2 * 2048 MB = 4 * 1024 MB
>>
>>
>>         what are these properties mapreduce.map.memory.mb and
>>         mapreduce.reduce.memory.mb
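
For context, these two properties tell YARN how much memory to request
for the container that runs each map and each reduce task. A minimal
mapred-site.xml sketch using the example values above (2 GB per map,
4 GB per reduce; these are illustrative values, not defaults):

```xml
<!-- mapred-site.xml: per-task container sizes (example values, not defaults) -->
<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value> <!-- each map task gets a 2 GB container -->
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value> <!-- each reduce task gets a 4 GB container -->
  </property>
</configuration>
```

The task's JVM heap (mapreduce.map.java.opts / mapreduce.reduce.java.opts)
must fit inside this container size, which is the relationship the Stack
Overflow link above discusses.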
>>
>>         On Wednesday 15 October 2014 06:17 PM, Shahab Yunus wrote:
>>>         It cannot run more mappers (tasks) in parallel than there
>>>         are underlying cores available. Likewise, it cannot run
>>>         multiple mappers in parallel if each mapper's (task's)
>>>         memory requirement is greater than the allocated and
>>>         available container size configured on each node.
>>>
>>>         In the links that I provided earlier, see the section
>>>         "Configuring YARN".
>>>
>>>         Also this:
>>>         http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
>>>         Section "1. YARN Concurrency (aka “What Happened to Slots?”)"
>>>
>>>         This should help in putting things in perspective regarding
>>>         how resource allocation for each task, container and
>>>         resources available on the node relate to each other.
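
That relationship can be sketched with some back-of-the-envelope
arithmetic (illustrative only, not how the YARN scheduler is actually
implemented; the node and container sizes below are assumed example
values, not defaults from any Hadoop release):

```python
# Illustrative arithmetic: how one node's resources cap concurrent map tasks.
# All values below are assumed examples for the sketch.

node_memory_mb = 8192    # yarn.nodemanager.resource.memory-mb
node_vcores = 4          # yarn.nodemanager.resource.cpu-vcores
map_container_mb = 2048  # mapreduce.map.memory.mb

# A container can only start if both memory and a vcore are available,
# so per-node concurrency is bounded by the scarcer resource.
by_memory = node_memory_mb // map_container_mb  # containers that fit by memory
by_cores = node_vcores                          # assuming 1 vcore per map task

max_parallel_maps = min(by_memory, by_cores)
print(max_parallel_maps)  # -> 4
```

With a bigger container size (say 4096 MB), memory becomes the limit
and only 2 maps run at once despite 4 cores being free.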
>>>
>>>         Regards,
>>>         Shahab
>>>
>>>         On Wed, Oct 15, 2014 at 8:18 AM, SACHINGUPTA
>>>         <sachin@datametica.com <mailto:sachin@datametica.com>> wrote:
>>>
>>>             But Shahab, if I have only a 4-core machine, then how
>>>             can YARN run more than 4 mappers in parallel?
>>>
>>>             On Wednesday 15 October 2014 05:45 PM, Shahab Yunus wrote:
>>>>             It depends on the memory settings as well, i.e. how
>>>>             many resources you want to assign to each container.
>>>>             YARN will then run as many mappers in parallel as
>>>>             possible.
>>>>
>>>>             See this:
>>>>             http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
>>>>             http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html
>>>>
>>>>             Regards,
>>>>             Shahab
>>>>
>>>>             On Wed, Oct 15, 2014 at 8:09 AM, SACHINGUPTA
>>>>             <sachin@datametica.com <mailto:sachin@datametica.com>>
>>>>             wrote:
>>>>
>>>>                 Hi guys,
>>>>
>>>>                 I have a situation in which I have a machine with 4
>>>>                 processors and I have 5 containers. Does it mean I
>>>>                 can have only 4 mappers running in parallel at a
>>>>                 time?
>>>>
>>>>                 And if the number of mappers is not dependent on
>>>>                 the number of containers on a machine, then what is
>>>>                 the use of the container concept?
>>>>
>>>>                 Sorry if I have asked anything obvious.
>>>>
>>>>                 -- 
>>>>                 Thanks
>>>>                 Sachin Gupta
>>>>
>>>>
>>>
>>>             -- 
>>>             Thanks
>>>             Sachin Gupta
>>>
>>>
>>
>>         -- 
>>         Thanks
>>         Sachin Gupta
>>
>>
>
>     -- 
>     Thanks
>     Sachin Gupta
>
>

-- 
Thanks
Sachin Gupta

