hadoop-common-user mailing list archives

From Vasilis Liaskovitis <vlias...@gmail.com>
Subject Re: ClassCastException in lzo indexer
Date Wed, 03 Feb 2010 00:41:27 GMT
Hi Todd,

On Tue, Feb 2, 2010 at 1:41 PM, Todd Lipcon <todd@cloudera.com> wrote:
> Hi Vasilis,
>
> Did you make sure to "ant clean" before rebuilding hadoop-lzo if you updated
> the code? Also, can you paste your configuration for io.compression.codecs ?
>

I rebuilt after doing "ant clean" but still get the same behaviour.
Here's my codecs config:

<property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>

<property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

My error looks lzop-specific. Should my io.compression.codec.lzo.class
setting include something about lzop?
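
For anyone checking the same thing, here is a minimal sketch that prints
where each configured codec class is loaded from. It rests on an assumption
that nothing in this thread confirms: that the ClassCastException comes from
two different builds of hadoop-lzo being visible on the classpath. The class
name CodecClasspathCheck and its two helper methods are hypothetical; only
the codec class names come from the config above.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

public class CodecClasspathCheck {

    /** Splits a comma-separated io.compression.codecs value into trimmed class names. */
    public static List<String> parseCodecs(String value) {
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Returns the jar or directory a class was loaded from, or a short note if unavailable. */
    public static String locationOf(String className) {
        try {
            // Load without running static initializers; we only want the origin.
            Class<?> cls = Class.forName(
                    className, false, CodecClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            URL url = (src == null) ? null : src.getLocation();
            return (url == null) ? "(bootstrap or unknown source)" : url.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found on classpath)";
        }
    }

    public static void main(String[] args) {
        // Codec list copied from the io.compression.codecs property above.
        String codecs = "org.apache.hadoop.io.compress.GzipCodec,"
                + "org.apache.hadoop.io.compress.DefaultCodec,"
                + "com.hadoop.compression.lzo.LzoCodec,"
                + "com.hadoop.compression.lzo.LzopCodec,"
                + "org.apache.hadoop.io.compress.BZip2Codec";
        List<String> names = parseCodecs(codecs);
        // The two class names that appear in the exception message.
        names.add("com.hadoop.compression.lzo.LzopCodec$LzopDecompressor");
        names.add("com.hadoop.compression.lzo.LzopDecompressor");
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

Run it with the same classpath the indexer uses. If the two decompressor
class names resolve to different jars, that would be consistent with a stale
copy of hadoop-lzo being picked up, though it would not prove it.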

thanks,

- Vasilis

> Thanks
> -Todd
>
> On Tue, Feb 2, 2010 at 9:09 AM, Vasilis Liaskovitis <vliaskov@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to use hadoop-0.20.1 and hadoop-lzo
>> (http://github.com/kevinweil/hadoop-lzo) to index an lzo file. I've
>> followed the instructions and copied both the jar and the native libs
>> into my classpaths. I am getting this error in both local and
>> distributed indexer mode:
>>
>> bin/hadoop jar lib/hadoop-lzo-0.3.0.jar
>> com.hadoop.compression.lzo.LzoIndexer /data/userVisits.lzo
>>
>> 10/02/02 17:30:38 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
>> 10/02/02 17:30:38 INFO lzo.LzoCodec: Successfully loaded & initialized
>> native-lzo library
>> 10/02/02 17:30:38 INFO lzo.DistributedLzoIndexer: Adding LZO file
>> /data/UserVisits.lzo to indexing list (no index currently exists)
>> 10/02/02 17:30:38 WARN mapred.JobClient: Use GenericOptionsParser for
>> parsing the arguments. Applications should implement Tool for the
>> same.
>> 10/02/02 17:30:39 INFO input.FileInputFormat: Total input paths to process
>> : 1
>> 10/02/02 17:30:39 INFO mapred.JobClient: Running job: job_201002020748_0409
>> 10/02/02 17:30:40 INFO mapred.JobClient:  map 0% reduce 0%
>> 10/02/02 17:31:02 INFO mapred.JobClient: Task Id :
>> attempt_201002020748_0409_m_000000_0, Status : FAILED
>> java.lang.ClassCastException:
>> com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
>> to com.hadoop.compression.lzo.LzopDecompressor
>>
>> I get the same error with the local indexer:
>>
>> user2@amdqc08:~/hadoop-0.20.1-prof> bin/hadoop jar
>> lib/hadoop-lzo-0.3.0.jar com.hadoop.compression.lzo.LzoIndexer
>> /data/UserVisits.lzo
>> 10/02/02 17:38:47 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
>> 10/02/02 17:38:47 INFO lzo.LzoCodec: Successfully loaded & initialized
>> native-lzo library
>> 10/02/02 17:38:47 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file
>> /data/UserVisits.lzo, size 9.94 GB...
>> Exception in thread "main" java.lang.ClassCastException:
>> com.hadoop.compression.lzo.LzopCodec$LzopDecompressor cannot be cast
>> to com.hadoop.compression.lzo.LzopDecompressor
>>
>> Is hadoop-0.20.1 compatible with the git master of hadoop-lzo? Or do I
>> need to use an older version of hadoop-lzo to be compatible with
>> hadoop-0.20.1?
>>
>> A different but related question: in order to compress intermediate
>> map outputs with lzo and process them efficiently, does the map/reduce
>> job need to explicitly create index files for the compressed
>> intermediate files? I think that at the shuffle stage the input files
>> have already been split, so we are not relying on indexing the
>> intermediate lzo files for parallel shuffling. Is that correct? Or
>> would a job need to index the intermediate files? If yes, can this be
>> handled automatically by hadoop-lzo?
>>
>> Any suggestions are welcome.
>> Thanks,
>>
>> - Vasilis
>>
>
