hadoop-common-user mailing list archives

From Espen Amble Kolstad <es...@trank.no>
Subject Re: LzoCodec not working correctly?
Date Fri, 25 May 2007 07:45:33 GMT
Hi,

I changed LzoCompressor.finished() from:

  public synchronized boolean finished() {
    // ...
    return (finished && compressedDirectBuf.remaining() == 0);
  }

to:

  public synchronized boolean finished() {
    // ...
    return (finish && compressedDirectBuf.remaining() == 0);
  }

With that change it seems to work correctly. I tested it with
CompressionCodecFactory.main: it failed before the change and works
after it. Both compression and decompression work.
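
To illustrate why the choice of flag matters, here is a minimal, self-contained sketch. It is a toy model, not the Hadoop code: ToyCompressor, ToyStream and twoWritesSucceed are hypothetical names, and the model only assumes what is described in this thread (the compressor sets finished = true once userBufLen <= 0, and the stream refuses to write when finished() returns true).

```java
import java.io.IOException;

public class FinishedFlagSketch {

    /** Hypothetical toy compressor; it only tracks the two flags discussed in this thread. */
    static class ToyCompressor {
        private final boolean patched;   // true: finished() consults 'finish' instead of 'finished'
        private boolean finish;          // caller has declared that no more input will follow
        private boolean finished;        // internal: the current input buffer has been consumed
        private int userBufLen;          // bytes of caller input not yet consumed
        private int compressedRemaining; // stand-in for compressedDirectBuf.remaining()

        ToyCompressor(boolean patched) {
            this.patched = patched;
        }

        void setInput(int len) {
            userBufLen = len;
        }

        void finish() {
            finish = true;
        }

        /** Pretends to compress all pending input and to drain the output buffer. */
        void compress() {
            userBufLen = 0;
            compressedRemaining = 0;
            if (userBufLen <= 0) {
                finished = true; // assumption: mirrors the check described in the thread
            }
        }

        boolean finished() {
            boolean flag = patched ? finish : finished;
            return flag && compressedRemaining == 0;
        }
    }

    /** Toy stream that rejects writes once the compressor reports it is finished. */
    static class ToyStream {
        private final ToyCompressor compressor;

        ToyStream(ToyCompressor compressor) {
            this.compressor = compressor;
        }

        void write(byte[] b) throws IOException {
            if (compressor.finished()) {
                throw new IOException("write beyond end of stream");
            }
            compressor.setInput(b.length);
            compressor.compress();
        }

        void close() {
            compressor.finish();
        }
    }

    /** Returns true if two consecutive writes followed by close() succeed. */
    static boolean twoWritesSucceed(boolean patched) {
        ToyStream out = new ToyStream(new ToyCompressor(patched));
        try {
            out.write("abc".getBytes());
            out.write("def".getBytes());
            out.close();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("original check: " + twoWritesSucceed(false));
        System.out.println("patched check:  " + twoWritesSucceed(true));
    }
}
```

In this toy model the second write fails with the original check, because finished is already true after the first buffer is consumed, and succeeds with the patched check, because finish is only set on close(). Whether the real LzoCompressor behaves the same way in every code path still needs to be verified.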

Could you verify, Arun? I'll do some more testing.

thanks,
Espen

Espen Amble Kolstad wrote:
> Hi Arun,
> 
> Arun C Murthy wrote:
>> Espen,
>>
>> On Thu, May 24, 2007 at 03:49:38PM +0200, Espen Amble Kolstad wrote:
>>> Hi,
>>>
>>> I've been trying to use LzoCodec to write a compressed file:
>>>
>> Could you try this command:
>> $ bin/hadoop jar build/hadoop-0.12.4-dev-test.jar testsequencefile -seed 0 -count 10000 -compressType RECORD blah.seq -codec org.apache.hadoop.io.compress.LzoCodec -check
> This works like it should:
> 07/05/25 08:29:07 INFO io.SequenceFile: count = 10000
> 07/05/25 08:29:07 INFO io.SequenceFile: megabytes = 1
> 07/05/25 08:29:07 INFO io.SequenceFile: factor = 10
> 07/05/25 08:29:07 INFO io.SequenceFile: create = true
> 07/05/25 08:29:07 INFO io.SequenceFile: seed = 0
> 07/05/25 08:29:07 INFO io.SequenceFile: rwonly = false
> 07/05/25 08:29:07 INFO io.SequenceFile: check = true
> 07/05/25 08:29:07 INFO io.SequenceFile: fast = false
> 07/05/25 08:29:07 INFO io.SequenceFile: merge = false
> 07/05/25 08:29:07 INFO io.SequenceFile: compressType = RECORD
> 07/05/25 08:29:07 INFO io.SequenceFile: compressionCodec =
> org.apache.hadoop.io.compress.LzoCodec
> 07/05/25 08:29:07 INFO io.SequenceFile: file = blah.seq
> 07/05/25 08:29:07 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 07/05/25 08:29:07 INFO compress.LzoCodec: Successfully loaded &
> initialized native-lzo library
> 07/05/25 08:29:07 INFO io.SequenceFile: creating 10000 records with
> RECORD compression
> 07/05/25 08:29:13 INFO io.SequenceFile: writing intermediate results to
> /tmp/hadoop-espen/mapred/local/intermediate.1
> 07/05/25 08:29:15 INFO io.SequenceFile: done sorting 10000 debug
> 07/05/25 08:29:15 INFO io.SequenceFile: sorting 10000 records in memory
> for debug
> 
> I think the difference is that I write to the stream twice, whereas the
> Hadoop code seems to always write all bytes at once.
> 
> The code in LzoCompressor checks for userBufLen <= 0 and sets finished =
> true, and userBufLen is set in setInput(). As a result, it seems you can
> only write to the stream once?!
> 
> - Espen
> 
>> LzoCodec seems to work fine for me... maybe your FileOutputStream was somehow corrupted?
>>
>> thanks,
>> Arun
>>
>>> public class LzoTest {
>>>
>>>   public static void main(String[] args) throws Exception {
>>>      final LzoCodec codec = new LzoCodec();
>>>      codec.setConf(new Configuration());
>>>      final CompressionOutputStream out =
>>>          codec.createOutputStream(new FileOutputStream("test.lzo"));
>>>      out.write("abc".getBytes());
>>>      out.write("def".getBytes());
>>>      out.close();
>>>   }
>>> }
>>>
>>> I get the following output:
>>>
>>> 07/05/24 15:44:22 INFO util.NativeCodeLoader: Loaded the native-hadoop
>>> library
>>> 07/05/24 15:44:22 INFO compress.LzoCodec: Successfully loaded &
>>> initialized native-lzo library
>>> Exception in thread "main" java.io.IOException: write beyond end of stream
>>> 	at org.apache.hadoop.io.compress.BlockCompressorStream.write(BlockCompressorStream.java:68)
>>> 	at java.io.OutputStream.write(OutputStream.java:58)
>>> 	at no.trank.tI.LzoTest.main(LzoTest.java:19)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> 	at java.lang.reflect.Method.invoke(Method.java:597)
>>> 	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
>>>
>>> Isn't it possible to use LzoCodec for this purpose, or is this a bug?
>>>
>>> - Espen
> 

