hadoop-common-dev mailing list archives

From Konstantin Shvachko <...@yahoo-inc.com>
Subject Re: Adding new filesystem to Hadoop causing too many Map tasks
Date Tue, 05 Jun 2007 21:32:45 GMT
Could you please check your Ant version?
It should be at least 1.6.3, I believe.
--Konst

Esteban Molina-Estolano wrote:

> Thanks for the advice. I am using an old version.
> I'm trying to upgrade to 0.12.3, but when I try to compile (even  
> without adding in my own code) I get:
>
> [eestolan@issdm-1 ~/hadoop-0.12.3]$ ant
> Buildfile: build.xml
>
> init:
>
> BUILD FAILED
> /cse/grads/eestolan/hadoop-0.12.3/build.xml:114: Specify at least one  
> source--a file or resource collection.
>
> Total time: 0 seconds
>
> That line in build.xml has the following:
>
>     <touch datetime="01/25/1971 2:00 pm">
>       <fileset dir="${conf.dir}" includes="**/*.template"/>
>       <fileset dir="${contrib.dir}" includes="**/*.template"/>
>     </touch>
>
> What might be causing the error?
>
> Thanks,
>     ~ Esteban
>
>
> On Jun 1, 2007, at 9:26 AM, Owen O'Malley wrote:
>
>>
>> On Jun 1, 2007, at 1:14 AM, Esteban Molina-Estolano wrote:
>>
>>> I'm having trouble with a small test: RandomWriter, 4 TaskTracker  
>>> nodes, 5 maps per node, 10 MB per map, for a total of 200 MB over  
>>> 20 Map tasks. I tried it on Hadoop with DFS, and it took about 30  
>>> seconds. Then, I ran the same test using Ceph. I changed  
>>> fs.default.name to "ceph:///"; added fs.ceph.impl as  
>>> org.apache.hadoop.fs.ceph.CephFileSystem; and left all other  
>>> configuration settings untouched. It ran horrifically slowly.
>>>
>>> Then the JobTracker spawned 400 Map tasks:
>>>
>>> I'm ending up with way too many Map tasks, and as a result the job  
>>> takes way too long to run.
>>
>>
>> That is really strange, especially because RandomWriter isn't  
>> looking at any real inputs. (Unless you are using version 0.11 or  
>> earlier of Hadoop...)  Are you using an old version of Hadoop? If  
>> so, I'd suspect it has something to do with the blocksize for the  
>> input files being too small (likely 1 byte or so). You need to  
>> return much bigger numbers for FileSystem.getBlockSize(Path) or
>> map/reduce will default to making very small input splits.
>>
>> -- Owen
>
>
>
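Owen's block-size point can be sketched numerically. The snippet below is an illustrative, standalone sketch (plain Java, no Hadoop dependency): the ceiling-division split count and the specific block sizes are assumptions for the sake of the arithmetic, since Hadoop's real split logic also considers other settings. It shows how a filesystem that reports a tiny block size makes the framework derive far more input splits, and hence far more map tasks, than DFS's 64 MB default would.

```java
// Illustrative sketch: how the reported block size drives the number
// of input splits (and thus map tasks) for a fixed amount of input.
public class BlockSizeSketch {
    static final long DATA_SIZE = 200L * 1024 * 1024; // 200 MB of job input

    // Rough split count: total size divided by the block size the
    // filesystem reports, rounded up (ceiling division).
    static long numSplits(long blockSize) {
        return (DATA_SIZE + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long tiny = 512 * 1024;            // 512 KB: far too small
        long dfsLike = 64L * 1024 * 1024;  // 64 MB: DFS's default
        System.out.println("splits with 512 KB blocks: " + numSplits(tiny));    // 400
        System.out.println("splits with 64 MB blocks:  " + numSplits(dfsLike)); // 4
    }
}
```

A 512 KB reported block size turns 200 MB into 400 splits, which happens to match the 400 map tasks observed, though the block size CephFileSystem actually reports could of course differ.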

