mahout-dev mailing list archives

From "Han Hui Wen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-471) RowSimilarityJob-Mapper-EntriesToVectorsReducer failure
Date Tue, 17 Aug 2010 07:26:17 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899311#action_12899311 ]

Han Hui Wen  commented on MAHOUT-471:
-------------------------------------

In org.apache.mahout.math.Varint:

  /**
   * @see #writeSignedVarLong(long, DataOutput)
   */
  public static void writeSignedVarInt(int value, DataOutput out) throws IOException {
    // Great trick from http://code.google.com/apis/protocolbuffers/docs/encoding.html#types
    writeUnsignedVarInt((value << 1) ^ (value >> 31), out);
  }

  /**
   * @see #writeUnsignedVarLong(long, DataOutput)
   */
  public static void writeUnsignedVarInt(int value, DataOutput out) throws IOException {
    while ((value & 0xFFFFFF80) != 0L) {
      out.writeByte((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.writeByte(value & 0x7F);
  }


When value is 1073741824 or greater, its encoded form, (value << 1) ^ (value >> 31), may be
2147483648 or greater, which no longer fits in an int. When we then call
public static void writeUnsignedVarInt(int value, DataOutput out),
the function may truncate the long 2147483648 to an int.
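
One way to check this concern is a small standalone sketch that copies the two methods quoted above and encodes 1073741824. The class name VarintOverflowCheck, the helpers encodeSigned and decodeSigned, and the decoder are hypothetical and written only for this check; the decoder is not Mahout's readSignedVarInt, whose body is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VarintOverflowCheck {

  // Copied from the Varint snippet quoted above.
  public static void writeSignedVarInt(int value, DataOutput out) throws IOException {
    writeUnsignedVarInt((value << 1) ^ (value >> 31), out);
  }

  // Copied from the Varint snippet quoted above.
  public static void writeUnsignedVarInt(int value, DataOutput out) throws IOException {
    while ((value & 0xFFFFFF80) != 0L) {
      out.writeByte((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.writeByte(value & 0x7F);
  }

  // Hypothetical decoder for this sketch only (an assumption, not Mahout's code).
  public static int readSignedVarInt(DataInput in) throws IOException {
    int raw = 0;
    int shift = 0;
    int b;
    // Each byte carries 7 payload bits; the high bit marks "more bytes follow".
    while (((b = in.readByte()) & 0x80) != 0) {
      raw |= (b & 0x7F) << shift;
      shift += 7;
    }
    raw |= (b & 0x7F) << shift;
    // Undo the zigzag mapping applied by writeSignedVarInt.
    return (raw >>> 1) ^ -(raw & 1);
  }

  // Hypothetical helper: encode one int into a byte array.
  public static byte[] encodeSigned(int value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      writeSignedVarInt(value, new DataOutputStream(bytes));
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Hypothetical helper: decode one int from a byte array.
  public static int decodeSigned(byte[] encoded) {
    try {
      return readSignedVarInt(new DataInputStream(new ByteArrayInputStream(encoded)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    int value = 1073741824; // 2^30
    // In int arithmetic the shift wraps instead of widening to a long.
    int zigzag = (value << 1) ^ (value >> 31);
    System.out.println("zigzag as int: " + zigzag);
    byte[] encoded = encodeSigned(value);
    System.out.println("encoded length: " + encoded.length);
    System.out.println("round trip: " + decodeSigned(encoded));
  }
}
```

In this sketch the shifted value wraps to the int -2147483648 (bit pattern 0x80000000) rather than becoming a long, and the unsigned shift in the loop writes it out as five bytes that the sketch's decoder reads back as 1073741824. Whether Mahout's own readUnsignedVarInt handles the same bytes is not shown in this message.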

> RowSimilarityJob-Mapper-EntriesToVectorsReducer  failure
> --------------------------------------------------------
>
>                 Key: MAHOUT-471
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-471
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Han Hui Wen 
>            Priority: Minor
>             Fix For: 0.4
>
>
> I used Boolean data with SIMILARITY_TANIMOTO_COEFFICIENT and got the following failure:
> java.io.IOException: Task: attempt_201008101359_0084_r_000000_0 - The reduce copier failed
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.IOException: Intermediate merge failed
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2576)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2501)
> Caused by: java.lang.RuntimeException: java.io.EOFException
> 	at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
> 	at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
> 	at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:123)
> 	at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:50)
> 	at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:447)
> 	at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> 	at org.apache.hadoop.mapred.Merger.merge(Merger.java:107)
> 	at org.apache.hadoop.mapred.Merger.merge(Merger.java:93)
> 	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2551)
> 	... 1 more
> Caused by: java.io.EOFException
> 	at java.io.DataInputStream.readByte(DataInputStream.java:250)
> 	at org.apache.mahout.math.Varint.readUnsignedVarInt(Varint.java:159)
> 	at org.apache.mahout.math.Varint.readSignedVarInt(Varint.java:140)
> 	at org.apache.mahout.math.hadoop.similarity.SimilarityMatrixEntryKey.readFields(SimilarityMatrixEntryKey.java:65)
> 	at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
> 	... 9 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

