hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-532) Writable underrun in sort example
Date Thu, 14 Sep 2006 23:02:23 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-532?page=all ]

Owen O'Malley updated HADOOP-532:

    Attachment: seqfile-underread-check.patch

The compression codec is not reading the entire value buffer, but it is getting the correct
value. (I suspect the unread bytes are a crc.) This error message is the SequenceFile complaining
that the entire buffer was not used.
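The failure mode described above can be sketched in isolation: a deserializer consumes fewer bytes than the value record contains, and the reader's consistency check reports the mismatch. This is an illustrative stand-in, not Hadoop's actual SequenceFile internals; the class and method names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical sketch of the length check a reader can perform after
// deserializing a value: if the codec/Writable leaves bytes unconsumed,
// the reader flags an underread, as in the error quoted below.
public class UnderreadCheck {
    // `consume` stands in for how many bytes value.readFields() actually read.
    static void checkFullyConsumed(byte[] valueBuffer, int consume) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(valueBuffer));
        in.skipBytes(consume);              // simulate partial deserialization
        int leftover = in.available();      // bytes the deserializer did not touch
        if (leftover != 0) {
            throw new IOException("read " + consume + " bytes, should read "
                                  + valueBuffer.length);
        }
    }

    public static void main(String[] args) throws IOException {
        checkFullyConsumed(new byte[2052], 2052);     // fully consumed: no error
        try {
            checkFullyConsumed(new byte[2052], 2048); // underread: mirrors the bug
        } catch (IOException e) {
            System.out.println(e.getMessage());       // prints "read 2048 bytes, should read 2052"
        }
    }
}
```

The 2048-versus-2052 gap matches the report below; per the diagnosis above, the unread tail is likely a codec-level trailer such as a crc.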

This patch:
  1. extends the unit test to use bigger values so that we detect the problem
  2. allows the user of the org.apache.hadoop.io.TestSequenceFile main program to control the random seed (and prints out the seed value, even if it is random)
  3. checks that the stream is done by trying to read the next byte on the input stream
  4. removes some redundant buffering of the already buffered value stream
  5. marks the start of the value in non-block compressed sequence files and does a reset at the front of getCurrentValue
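Step 3 above relies on a standard idiom: after deserialization, attempt one more read and expect end-of-stream. A minimal sketch, assuming an already-decompressed value stream (the names here are illustrative, not the patch's actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative end-of-stream check: InputStream.read() returns -1 at EOF,
// so any other result means the value stream was not fully consumed.
public class EndOfStreamCheck {
    static void verifyStreamDone(InputStream in) throws IOException {
        if (in.read() != -1) {
            throw new IOException("value stream not fully consumed");
        }
    }

    public static void main(String[] args) throws IOException {
        verifyStreamDone(new ByteArrayInputStream(new byte[0]));   // exhausted: ok
        try {
            verifyStreamDone(new ByteArrayInputStream(new byte[]{42})); // leftover byte
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```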

> Writable underrun in sort example
> ---------------------------------
>                 Key: HADOOP-532
>                 URL: http://issues.apache.org/jira/browse/HADOOP-532
>             Project: Hadoop
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.6.1
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>             Fix For: 0.6.2
>         Attachments: seqfile-underread-check.patch
> When running the sort benchmark, I get consistent failures of this sort:
> java.lang.RuntimeException: java.io.IOException: org.apache.hadoop.io.BytesWritable@43d748ad read 2048 bytes, should read 2052
>     at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:150)
>     at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:39)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:271)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1066)
> Caused by: java.io.IOException: org.apache.hadoop.io.BytesWritable@43d748ad read 2048 bytes, should read 2052
>     at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1163)
>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1239)
>     at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.getNext(ReduceTask.java:181)
>     at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:147)
>     ... 3 more

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
