hadoop-common-user mailing list archives

From "brad" <b...@bcs-mail.net>
Subject Question on MapFile.java and SequenceFile.createWriter
Date Wed, 01 Sep 2010 03:21:04 GMT
I apologize in advance if this is a dumb question.  I'm new to Hadoop and
I'm using it with Nutch.  While reviewing my log files to find out why zlib
was being used instead of my specified codec, I found the following:

In Hadoop 0.20.2, MapFile.java, lines 156-163:

For this.data, SequenceFile.createWriter is called with the user-supplied
codec:
      this.data =
        SequenceFile.createWriter
        (fs, conf, dataFile, keyClass, valClass, compress, codec, progress);

However, on the next line, the call for this.index does not pass the codec.
As a result, it falls back to "new DefaultCodec()", which is zlib, instead
of the user-supplied codec:
      this.index =
        SequenceFile.createWriter
        (fs, conf, indexFile, keyClass, LongWritable.class,
         CompressionType.BLOCK, progress);
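
To make the behavior concrete, here is a minimal, self-contained sketch of the pattern. It does not use the real Hadoop classes: Codec, ZlibLikeCodec, UserCodec, Writer and both createWriter overloads are hypothetical stand-ins that only model an overload without a codec argument falling back to a default. Whether passing the same codec to the index writer is safe in Hadoop itself is exactly the open question here.

```java
// Stand-in types only: none of these are the real Hadoop classes.
public class CodecFallbackSketch {

    /** Hypothetical stand-in for a compression codec. */
    interface Codec {
        String name();
    }

    /** Models the default codec (zlib) used when no codec is passed. */
    static final class ZlibLikeCodec implements Codec {
        public String name() { return "zlib"; }
    }

    /** Models the codec the user actually asked for. */
    static final class UserCodec implements Codec {
        public String name() { return "user-codec"; }
    }

    /** Hypothetical stand-in for a file writer that remembers its codec. */
    static final class Writer {
        final String file;
        final Codec codec;

        Writer(String file, Codec codec) {
            this.file = file;
            this.codec = codec;
        }
    }

    /** Overload without a codec: silently falls back to the default. */
    static Writer createWriter(String file) {
        return createWriter(file, new ZlibLikeCodec());
    }

    /** Overload with an explicit codec: uses what the caller supplied. */
    static Writer createWriter(String file, Codec codec) {
        return new Writer(file, codec);
    }

    public static void main(String[] args) {
        Codec codec = new UserCodec();

        // Mirrors the two calls quoted above: data gets the codec, index does not.
        Writer data = createWriter("data", codec);
        Writer index = createWriter("index");

        // The change being asked about: pass the codec to the index writer too.
        Writer fixedIndex = createWriter("index", codec);

        System.out.println("data:        " + data.codec.name());       // user-codec
        System.out.println("index:       " + index.codec.name());      // zlib
        System.out.println("fixed index: " + fixedIndex.codec.name()); // user-codec
    }
}
```

In this model the only difference between the two index writers is which overload is called, which is the same difference visible between the this.data and this.index lines quoted above.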

Is there a reason for this?  I looked for an explanation but was unable to
find one.  Is it because the same codec instance can't be reused in another
createWriter call?  If that is the case, why not create another instance of
the same codec?

I checked and this is in the current trunk as well.

Did I miss something?

