hadoop-common-user mailing list archives

From Mahadev Konar <maha...@yahoo-inc.com>
Subject Re: Hadoop-Archive Error for size of input data >2GB
Date Mon, 21 Jul 2008 18:18:00 GMT
Hi Pratyush,

  I think this bug was fixed in
https://issues.apache.org/jira/browse/HADOOP-3545.

Can you apply the patch and see if it works?

Mahadev


On 7/21/08 5:56 AM, "Pratyush Banerjee" <pratyushbanerjee@aol.com> wrote:

> Hi All,
> 
> I have been using hadoop archives programmatically to generate har
> archives from some log files which are being dumped into HDFS.
> 
> When the input directory to the Hadoop archiving program has files of size
> more than 2GB, the archiving strangely fails with an error message saying
> 
> INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=   Illegal Capacity: -1
> 
> Going into the code, I found that this was due to numMaps having the
> value -1.
> 
> As per the code in org.apache.hadoop.util.HadoopArchives:
> archive(List<Path> srcPaths, String archiveName, Path dest)
> 
> numMaps is initialized as
> int numMaps = (int)(totalSize/partSize);
> //run atleast one map.
> conf.setNumMapTasks(numMaps == 0? 1:numMaps);
> 
> partSize is statically assigned the value of 2GB at the beginning
> of the class as
> 
> static final long partSize = 2 * 1024 * 1024 * 1024;
> 
> Strangely enough, the value I find assigned to partSize is -2147483648.
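> 
> To illustrate (this is just a standalone snippet of mine, not the Hadoop
> source): in Java the expression 2 * 1024 * 1024 * 1024 is evaluated in
> 32-bit int arithmetic before the result is widened to long, so it wraps
> around to Integer.MIN_VALUE. Writing the first literal as 2L keeps the
> whole multiplication in 64 bits:
> 
> public class PartSizeOverflow {
>     public static void main(String[] args) {
>         // evaluated as int, wraps around before the widening assignment
>         long overflowed = 2 * 1024 * 1024 * 1024;   // prints -2147483648
>         // the L suffix forces long (64-bit) arithmetic
>         long intended = 2L * 1024 * 1024 * 1024;    // prints 2147483648
>         System.out.println(overflowed);
>         System.out.println(intended);
>     }
> }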
> 
> As a result, for input directories larger than 2GB, numMaps is assigned
> -1, which leads to the code throwing the error above.
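> 
> To make the arithmetic concrete (the 3GB figure is just an example):
> with totalSize = 3221225472 (3GB) and partSize = -2147483648, the
> division 3221225472 / -2147483648 truncates toward zero to -1, so
> numMaps comes out as -1, matching the value reported in the error
> message.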
> 
> I am using hadoop-0.17.1, and I got the archiving facility after applying
> the hadoop-3307_4 patch.
> 
> This looks like a bug to me, so please let me know how to go about it.
> 
> Pratyush Banerjee
> 

