hadoop-common-user mailing list archives

From Allen Wittenauer ...@yahoo-inc.com>
Subject Re: taking lot of time in doing map task after 5% completion
Date Wed, 02 Jul 2008 04:32:23 GMT
On 7/1/08 2:18 PM, "charan@students.iiit.ac.in" <charan@students.iiit.ac.in> wrote:
>    We are converting 1.6 million text inputs into images using Hadoop,
> but the job is very slow: it completes 1% of the job in 4 minutes and
> only 3%-4% in 1 hr, and it takes a long time to proceed to 100%. Is
> there anything wrong in my Hadoop setup, or is there some other problem?
> The job runs quickly when I give it an input of 1000 or 5000 items,
> taking only 23 sec to 1 min 13 sec. Each generated image is around
> 13-30 kilobytes.

    It sounds as though you have a very large number of very small files.
HDFS doesn't perform well under those conditions, and it will typically send
the NameNode Java process into a garbage-collection tailspin.  Try combining
the data into bigger files.
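
    As a rough illustration of "combining the data into bigger files", here
is a minimal sketch in plain Python. It assumes the inputs are small text
files sitting in a local directory before they are loaded into HDFS. The
function name combine_small_files, the bundle-NNNNN.tsv naming, and the
one-record-per-line tab-separated layout are all made up for this example;
they are not part of Hadoop.

```python
import os


def combine_small_files(input_dir, output_dir, max_bytes=64 * 1024 * 1024):
    """Pack many small text files into a few larger bundle files.

    Each output line is "<original file name>\t<escaped file content>".
    Hypothetical helper for illustration only; not a Hadoop API.
    """
    os.makedirs(output_dir, exist_ok=True)
    bundles = []   # paths of the bundle files written so far
    out = None     # currently open bundle file
    written = 0    # bytes written to the current bundle

    for name in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "r", encoding="utf-8") as f:
            # Escape backslashes, newlines and tabs so that every
            # record stays on a single line.
            content = (
                f.read()
                .replace("\\", "\\\\")
                .replace("\n", "\\n")
                .replace("\t", "\\t")
            )
        record = f"{name}\t{content}\n"
        size = len(record.encode("utf-8"))

        # Start a new bundle when the current one would exceed max_bytes.
        if out is None or written + size > max_bytes:
            if out is not None:
                out.close()
            bundle_path = os.path.join(
                output_dir, f"bundle-{len(bundles):05d}.tsv"
            )
            out = open(bundle_path, "w", encoding="utf-8", newline="\n")
            bundles.append(bundle_path)
            written = 0

        out.write(record)
        written += size

    if out is not None:
        out.close()
    return bundles
```

    With something like this, a job would read a handful of large files,
one record per line, instead of 1.6 million tiny ones. The 64 MB default
for max_bytes is only a placeholder; a value near your HDFS block size is
probably a reasonable starting point.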
