hadoop-common-user mailing list archives

From Pratyush Banerjee <pratyushbaner...@aol.com>
Subject Re: Hadoop-Archive Error for size of input data >2GB
Date Tue, 22 Jul 2008 05:49:06 GMT
Thanks Mahadev,
Thanks for letting me know about the patch. I have already applied it, and 
the archiving seems to run fine for an input directory of about 5GB.

I am currently testing the same programmatically, but since it works 
from the command line, it should ideally work that way too.

thanks and regards~

Pratyush

mahadev@yahoo-inc.com wrote:
> Hi Pratyush,
>
>   I think this bug was fixed in
> https://issues.apache.org/jira/browse/HADOOP-3545.
>
> Can you apply the patch and see if it works?
>
> Mahadev
>
>
> On 7/21/08 5:56 AM, "Pratyush Banerjee" <pratyushbanerjee@aol.com> wrote:
>
>   
>> Hi All,
>>
>> I have been using Hadoop archives programmatically to generate har
>> archives from some log files that are being dumped into HDFS.
>>
>> When the input directory to the Hadoop archiving program holds more than
>> 2GB of data, the archiving strangely fails with an error message saying
>>
>> INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> processName=JobTracker, sessionId=   Illegal Capacity: -1
>>
>> Going into the code, I found that this was due to numMaps having the
>> value -1.
>>
>> As per the code in org.apache.hadoop.util.HadoopArchives:
>> archive(List<Path> srcPaths, String archiveName, Path dest)
>>
>> the numMaps is initialized as
>> int numMaps = (int)(totalSize/partSize);
>> //run atleast one map.
>> conf.setNumMapTasks(numMaps == 0? 1:numMaps);
>>
>> partSize is statically assigned the value of 2GB at the beginning
>> of the class as
>>
>> static final long partSize = 2 * 1024 * 1024 * 1024
>>
>> Strangely enough, the value I find assigned to partSize is -2147483648.
>>
>> As a result, for input directories larger than 2GB, numMaps
>> is assigned -1, which leads to the code throwing the error.
>>
>> I am using hadoop-0.17.1, and I got the archiving facility after applying
>> the hadoop-3307_4 patch.
>>
>> This looks like a bug to me, so please let me know how to go about it.
>>
>> Pratyush Banerjee
>>
>>     
>
>   
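
The negative partSize reported above is consistent with Java int overflow: every operand in 2 * 1024 * 1024 * 1024 is an int, so the product wraps to -2147483648 before it is widened to long. The sketch below reproduces the arithmetic outside Hadoop. The class name, the constant names, and the numMaps helper are hypothetical, and the 2L literal is only one common way to avoid the overflow, not necessarily what the HADOOP-3545 patch does. The ArrayList call is likewise an assumption about where the "Illegal Capacity: -1" text could originate; the thread does not confirm it.

```java
import java.util.ArrayList;

public class PartSizeOverflow {
    // As quoted in the thread: int * int * int * int overflows, then widens to long.
    static final long BROKEN_PART_SIZE = 2 * 1024 * 1024 * 1024;

    // Hypothetical fix: the L suffix forces 64-bit arithmetic from the first operand.
    static final long FIXED_PART_SIZE = 2L * 1024 * 1024 * 1024;

    // Mirrors the quoted expression: int numMaps = (int)(totalSize/partSize);
    static int numMaps(long totalSize, long partSize) {
        return (int) (totalSize / partSize);
    }

    public static void main(String[] args) {
        long threeGb = 3L * 1024 * 1024 * 1024; // example input size above 2GB

        System.out.println(BROKEN_PART_SIZE);                   // -2147483648
        System.out.println(FIXED_PART_SIZE);                    // 2147483648
        System.out.println(numMaps(threeGb, BROKEN_PART_SIZE)); // -1
        System.out.println(numMaps(threeGb, FIXED_PART_SIZE));  // 1

        // Assumption: a negative count reaching a collection constructor
        // would produce the message seen in the log.
        try {
            new ArrayList<String>(numMaps(threeGb, BROKEN_PART_SIZE));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());                 // Illegal Capacity: -1
        }
    }
}
```

With the broken constant, any total size below 2GB divides to 0 and falls into the "run at least one map" branch, which would explain why only larger inputs fail.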
