hadoop-common-user mailing list archives

From "Hairong Kuang" <hair...@yahoo-inc.com>
Subject RE: compressed input files
Date Fri, 03 Aug 2007 17:37:36 GMT
A compressed input file does not get partitioned, so the number of mappers
is equal to the number of input files.
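
For reference, this is roughly the check behind that behaviour: an input format can declare compressed files non-splittable, so each .gz file becomes exactly one split and therefore one map task. Below is a minimal sketch against the old org.apache.hadoop.mapred API of the 0.13 line; it mirrors what TextInputFormat already does internally, and the class name WholeFileForCompressedInput is made up for illustration:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

// Illustrative only: shows how splittability is decided for compressed input.
public class WholeFileForCompressedInput extends TextInputFormat {
  private CompressionCodecFactory codecs;

  public void configure(JobConf conf) {
    super.configure(conf);                     // TextInputFormat sets up its own codec factory
    codecs = new CompressionCodecFactory(conf);
  }

  protected boolean isSplitable(FileSystem fs, Path file) {
    // If a codec (e.g. GzipCodec for *.gz) claims this file, it cannot be
    // decompressed from an arbitrary byte offset, so keep it as a single
    // split: one compressed file -> one split -> one mapper.
    return codecs.getCodec(file) == null;
  }
}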

Hairong 

-----Original Message-----
From: Dennis Kubes [mailto:kubes@apache.org] 
Sent: Wednesday, August 01, 2007 11:48 PM
To: hadoop-user@lucene.apache.org
Subject: Re: compressed input files

I don't know about your record count, but the link error means that you don't
have the glibc version that was used to compile the Hadoop native libraries.
It shouldn't matter, though, as Hadoop will fall back to the Java
implementations if the native code can't be used.
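
If you want to confirm which path you ended up on, NativeCodeLoader (the class producing the DEBUG lines quoted below) can be queried directly. A small sketch; the driver class name NativeCheck is made up:

import org.apache.hadoop.util.NativeCodeLoader;

// Prints whether libhadoop.so was loaded successfully.
public class NativeCheck {
  public static void main(String[] args) {
    if (NativeCodeLoader.isNativeCodeLoaded()) {
      System.out.println("Native hadoop library loaded.");
    } else {
      // This is the case hit by the GLIBC_2.4 error below; Hadoop then
      // uses its pure-Java compression codecs instead.
      System.out.println("Native library not loaded; falling back to Java codecs.");
    }
  }
}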

Dennis Kubes

Sandhya E wrote:
> Hi
> 
> I'm trying to pass .gz files as input to Hadoop. At the end of the
> MapReduce job, the number of input records read from the input files is
> around 480, but when I uncompress the files, the number of input records
> read is around 3000. Why is there such a difference? There is also a
> warning message at the start of execution:
> 07/08/01 23:18:55 DEBUG util.NativeCodeLoader: Trying to load the 
> custom-built native-hadoop library...
> 07/08/01 23:18:55 DEBUG util.NativeCodeLoader: Failed to load 
> native-hadoop with error: java.lang.UnsatisfiedLinkError:
> /local/offline2/hadoop-0.13.0/lib/native/Linux-i386-32/libhadoop.so:
> /lib/tls/libc.so.6: version `GLIBC_2.4' not found (required by
> /local/offline2/hadoop-0.13.0/lib/native/Linux-i386-32/libhadoop.so)
> 
> Can this be the reason?
> 
> Many Thanks
> Sandhya
> 

