hadoop-common-user mailing list archives

From Usman Waheed <usm...@opera.com>
Subject Re: Hardware performance from HADOOP cluster
Date Mon, 19 Oct 2009 14:14:16 GMT
Tim,

I have 4 nodes (Quad Core 2.00GHz, 8GB RAM, 4x1TB disks), where one is 
the master+datanode and the rest are datanodes.

Job: Sort 40GB of random data
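
(For anyone wanting to reproduce this: I just use the stock examples jar,
roughly as below. The jar name and HDFS paths depend on your install, so
treat them as placeholders:

  bin/hadoop jar hadoop-*-examples.jar randomwriter /bench/unsorted
  bin/hadoop jar hadoop-*-examples.jar sort /bench/unsorted /bench/sorted

If I remember the defaults right, randomwriter writes roughly 10GB per node,
which is where the 40GB over 4 nodes comes from.)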

With the following configuration settings:

io.sort.factor: 10
io.sort.mb: 100
io.file.buffer.size: 65536
mapred.child.java.opts: -Xmx200M
dfs.datanode.handler.count=3
2 Mappers
2 Reducers
Time taken: 28 minutes
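
For reference, these are just the usual <property> blocks, split across the
config files the same way Tim lists them below. The 2 mappers / 2 reducers are
the per-node task slots, which I am assuming correspond to
mapred.tasktracker.{map,reduce}.tasks.maximum (a sketch, not a drop-in file):

  <!-- mapred-site.xml -->
  <property><name>io.sort.factor</name><value>10</value></property>
  <property><name>io.sort.mb</name><value>100</value></property>
  <property><name>mapred.child.java.opts</name><value>-Xmx200M</value></property>
  <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
  <property><name>mapred.tasktracker.reduce.tasks.maximum</name><value>2</value></property>

  <!-- core-site.xml -->
  <property><name>io.file.buffer.size</name><value>65536</value></property>

  <!-- hdfs-site.xml -->
  <property><name>dfs.datanode.handler.count</name><value>3</value></property>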

Still testing with more config changes, will send the results out.

-Usman

 

> Hi all,
>
> I thought I would post the findings of my tuning tests running the
> sort benchmark.
>
> This is all based on 10 machines (1 as master and 9 as DN/TT), each of them:
> Dell R300: 2.83GHz quad-core (2x6MB cache, 1 proc), 8GB RAM and 2x500GB SATA drives
>
> --- Vanilla installation ---
> 2M 2R: 36 mins
> 4M 4R: 36 mins (yes the same)
>
>
> --- Tuned according to Cloudera http://tinyurl.com/ykupczu ---
> io.sort.factor: 20  (mapred-site.xml)
> io.sort.mb: 200  (mapred-site.xml)
> io.file.buffer.size: 65536   (core-site.xml)
> mapred.child.java.opts: -Xmx512M  (mapred-site.xml)
>
> 2M 2R: 33.5 mins
> 4M 4R: 29 mins
> 8M 8R: 41 mins
>
>
> --- Increasing the task memory a little ---
> io.sort.factor: 20
> io.sort.mb: 200
> io.file.buffer.size: 65536
> mapred.child.java.opts: -Xmx1G
>
> 2M 2R: 29 mins  (adding dfs.datanode.handler.count=8 resulted in 30 mins)
> 4M 4R: 29 mins (yes the same)
>
>
> --- Increasing sort memory ---
> io.sort.factor: 32
> io.sort.mb: 320
> io.file.buffer.size: 65536
> mapred.child.java.opts: -Xmx1G
>
> 2M 2R: 31 mins (yes longer than lower sort sizes)
>
> I am going to stick with the following for now and get back to work...
>   io.sort.factor: 20
>   io.sort.mb: 200
>   io.file.buffer.size: 65536
>   mapred.child.java.opts: -Xmx1G
>   dfs.datanode.handler.count=8
>   4 Mappers
>   4 Reducers
>
> Hope that helps someone.  How did your tuning go Usman?
>
> Tim
>
>
> On Fri, Oct 16, 2009 at 10:41 PM, tim robertson
> <timrobertson100@gmail.com> wrote:
>   
>> No worries Usman,  I will try and do the same on Monday.
>>
>> Thanks Todd for the clarification.
>>
>> Tim
>>
>>
>> On Fri, Oct 16, 2009 at 5:30 PM, Usman Waheed <usmanw@opera.com> wrote:
>>     
>>> Hi Tim,
>>>
>>> I have been swamped with some other stuff so did not get a chance to run
>>> further tests on my setup.
>>> Will send them out early next week so we can compare.
>>>
>>> Cheers,
>>> Usman
>>>
>>>       
>>>> On Fri, Oct 16, 2009 at 4:01 AM, tim robertson
>>>> <timrobertson100@gmail.com> wrote:
>>>>
>>>>
>>>>         
>>>>> Hi all,
>>>>>
>>>>> Adding the following to core-site.xml, mapred-site.xml and
>>>>> hdfs-site.xml (based on Cloudera guidelines:
>>>>> http://tinyurl.com/ykupczu)
>>>>>  io.sort.factor: 15  (mapred-site.xml)
>>>>>  io.sort.mb: 150  (mapred-site.xml)
>>>>>  io.file.buffer.size: 65536   (core-site.xml)
>>>>>  dfs.datanode.handler.count: 3 (hdfs-site.xml  actually this is the
>>>>> default)
>>>>>
>>>>> and using the default of HADOOP_HEAPSIZE=1000 (hadoop-env.sh)
>>>>>
>>>>> Using 2 mappers and 2 reducers, can someone please help me with the
>>>>> maths as to why my jobs are failing with "Error: Java heap space" in
>>>>> the maps?
>>>>> (the same runs fine with io.sort.factor of 10 and io.sort.mb of 100)
>>>>>
>>>>> io.sort.mb of 200 x 4 (2 mappers, 2 reducers) = 0.8G
>>>>> Plus the 2 daemons on the node at 1G each = 1.8G
>>>>> Plus Xmx of 1G for each hadoop daemon task = 5.8G
>>>>>
>>>>> The machines have 8G in them.  Obviously my maths is screwy somewhere...
>>>>>
>>>>>
>>>>>
>>>>>           
>>>> Hi Tim,
>>>>
>>>> Did you also change mapred.child.java.opts? The HADOOP_HEAPSIZE parameter
>>>> is for the daemons, not the tasks. If you bump up io.sort.mb you also
>>>> have to bump up the -Xmx argument in mapred.child.java.opts to give the
>>>> actual tasks more RAM.
>>>>
>>>> -Todd
>>>>
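
To spell out Todd's point for the archives: HADOOP_HEAPSIZE in hadoop-env.sh
only sizes the daemon JVMs, while every spawned map/reduce task gets its heap
from mapred.child.java.opts, so the two knobs live in different places. A
rough sketch of the pairing (values only as an example):

  # hadoop-env.sh -- heap for the daemons (NameNode/DataNode/JobTracker/TaskTracker)
  export HADOOP_HEAPSIZE=1000

  <!-- mapred-site.xml: heap for each child task JVM, kept comfortably above io.sort.mb -->
  <property><name>io.sort.mb</name><value>200</value></property>
  <property><name>mapred.child.java.opts</name><value>-Xmx512M</value></property>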
>>>>
>>>>
>>>>         
>>>>> On Fri, Oct 16, 2009 at 9:59 AM, Erik Forsberg <forsberg@opera.com>
>>>>> wrote:
>>>>>
>>>>>           
>>>>>> On Thu, 15 Oct 2009 11:32:35 +0200
>>>>>> Usman Waheed <usmanw@opera.com> wrote:
>>>>>>
>>>>>>
>>>>>>             
>>>>>>> Hi Todd,
>>>>>>>
>>>>>>> Some changes have been applied to the cluster based on the
>>>>>>> documentation (URL) you noted below,
>>>>>>>
>>>>>>>               
>>>>>> I would also like to know what settings people are tuning on the
>>>>>> operating system level. The blog post mentioned here does not mention
>>>>>> much about that, except for the fileno changes.
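
The fileno change Erik mentions is the usual nofile bump in
/etc/security/limits.conf for whatever user runs the daemons; the user name
and value here are only an example:

  # /etc/security/limits.conf
  hadoop  soft  nofile  16384
  hadoop  hard  nofile  16384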
>>>>>>
>>>>>> We got about 3x the read performance when running DFSIOTest by mounting
>>>>>> our ext3 filesystems with the noatime parameter. I saw that mentioned
>>>>>> in the slides from some Cloudera presentation.
>>>>>>
>>>>>> (For those who don't know, the noatime parameter turns off the
>>>>>> recording of access time on files. That's a horrible performance killer
>>>>>> since it means every read of a file also means that the kernel must do
>>>>>> a write. These writes are probably queued up, but still, if you don't
>>>>>> need the atime (very few applications do), turn it off!)
>>>>>>
>>>>>> Have people been experimenting with different filesystems, or are most
>>>>>> of us running on top of ext3?
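
For what it's worth, applying that is just a mount option on the data disks;
the device and paths below are only an example:

  # /etc/fstab -- HDFS data partition mounted without atime updates
  /dev/sdb1   /data/1   ext3   defaults,noatime,nodiratime   0   2

  # or on a running box, without a reboot:
  mount -o remount,noatime,nodiratime /data/1

data=writeback would go into the same options list, though whether HDFS copes
with it is exactly the question Erik raises below.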
>>>>>>
>>>>>> How about mounting ext3 with "data=writeback"? That's rumoured to give
>>>>>> the best throughput and could help with write performance. From
>>>>>> mount(8):
>>>>>>
>>>>>>    writeback
>>>>>>           Data ordering is not preserved - data may be written into
>>>>>>           the main file system after its metadata has been committed
>>>>>>           to the journal. This is rumoured to be the highest throughput
>>>>>>           option. It guarantees internal file system integrity, however
>>>>>>           it can allow old data to appear in files after a crash and
>>>>>>           journal recovery.
>>>>>> How would the HDFS consistency checks cope with old data appearing in
>>>>>> the underlying files after a system crash?
>>>>>>
>>>>>> Cheers,
>>>>>> \EF
>>>>>> --
>>>>>> Erik Forsberg <forsberg@opera.com>
>>>>>> Developer, Opera Software - http://www.opera.com/
>>>>>>
>>>>>>
>>>>>>             
>>>>         
>>>       
>
>   

