hadoop-common-user mailing list archives

From Phil Whelan <phil...@gmail.com>
Subject Re: TeraSort question.
Date Tue, 11 Jan 2011 06:39:29 GMT
Hi Raj,

> Two of the 5 systems were seriously busy, big IO with lots of disk and network activity.
> The other three systems, CPU was more or less 100% idle, slight network and I/O.

TeraGen defaults to just 2 map tasks, so only 2 nodes are
utilized. Did you try setting the mapred.map.tasks option? I found a
very similar question and answer here:

http://www.mail-archive.com/common-user@hadoop.apache.org/msg00005.html

>> 1.      The data is generated in a fashion to where it is not balanced
>> across my cluster.  This is because the data is generated with 2 maps.
>
> These are due to the default #maps/#reduces in Map-Reduce.
> Use:
> $ bin/hadoop jar hadoop-*-dev-examples.jar teragen -Dmapred.map.tasks=8000 10000000000 /tera/in
> $ bin/hadoop jar hadoop-*-dev-examples.jar terasort -Dmapred.reduce.tasks=5300 /tera/in /tera/out
> Arun
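
If it helps, here is a rough dry-run sketch of the same two commands with the task counts as parameters. The helper name tera_cmds is made up for illustration, and the jar name is simply copied from Arun's message, so it may differ on your install. It only prints the commands and never runs Hadoop. Arun's counts were for a much larger data set, so pick values that suit your cluster.

```shell
#!/bin/sh
# Dry-run sketch: print teragen/terasort commands with explicit task counts.
# tera_cmds is a hypothetical helper; it only echoes, it never runs Hadoop.
tera_cmds() {
    rows=$1; maps=$2; reduces=$3; in_dir=$4; out_dir=$5
    # The -D option is written with no space after the dash and is placed
    # right after the program name, before the program's own arguments.
    echo "bin/hadoop jar hadoop-*-dev-examples.jar teragen -Dmapred.map.tasks=$maps $rows $in_dir"
    echo "bin/hadoop jar hadoop-*-dev-examples.jar terasort -Dmapred.reduce.tasks=$reduces $in_dir $out_dir"
}

# Example: the numbers from Arun's message.
tera_cmds 10000000000 8000 5300 /tera/in /tera/out
```

Once the printed lines look right, you can paste them into a shell on the cluster.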

Hope that helps.

Thanks,
Phil

On Mon, Jan 10, 2011 at 9:06 PM, Raj V <rajvish@yahoo.com> wrote:
> All,
>
> I have been running terasort on a 480 node hadoop cluster. I have also collected CPU, memory, disk, and
> network statistics during this run. The system stats are quite interesting. I can post them when
> I have put them together in some presentable format (if there is interest). However, while
> looking at the data, I noticed something interesting.
>
> I thought, intuitively, that all the systems in the cluster would have more or less
> similar behaviour (possibly shifted in time) and that the overall graph would look the same.
>
> Just to confirm it, I took 5 random nodes and looked at the CPU, disk, network, etc. activity
> when the sort was running. Strangely enough, it was not so. Two of the 5 systems were seriously
> busy, big IO with lots of disk and network activity. The other three systems, CPU was more
> or less 100% idle, slight network and I/O.
>
> Is that normal and/or expected? Shouldn't all the nodes be utilized in more or less the same manner
> over the length of the run?
>
> I generated the data for the sort using teragen (128MB block size, replication = 3).
>
> I would also be interested in other people's sort timings. Is there some place where
> people can post sort numbers (not just the record)?
>
> I will post the actual graphs of the 5 nodes, if there is interest, tomorrow. (Some
> logistical issues about posting them tonight.)
>
> I am using CDH3B3, even though I think this is not specific to CDH3B3.
>
> Sorry for the cross post.
>
> Raj
