mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Reduce Copier Failed
Date Mon, 20 Dec 2010 13:33:37 GMT
If it had failed for lack of memory you'd almost surely see an
OutOfMemoryError -- unless the framework swallowed it or something.

Have you set mapreduce.task.io.sort.mb and
mapreduce.task.io.sort.factor? (These go by slightly different names
in older versions of Hadoop.) They control how much of the worker's
memory is reserved for merges, and how many spill streams are merged
at once. Could help.
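
For illustration, a rough sketch (untested; the values are only placeholders)
of bumping those two settings on the job's Configuration -- on 0.20.x you'd
use the older names io.sort.mb and io.sort.factor instead:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Equivalent names on Hadoop 0.20.x: io.sort.mb and io.sort.factor.
conf.setInt("mapreduce.task.io.sort.mb", 256);    // MB of task memory used for sort/merge buffers
conf.setInt("mapreduce.task.io.sort.factor", 64); // number of spill streams merged in one pass
// ...then build and submit the job from this Configuration as usual.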

But at first glance this is a Hadoop problem, or at least a
configuration issue, not a Mahout one.

On Mon, Dec 20, 2010 at 1:27 PM, Niall Riddell <niall.riddell@xspca.com> wrote:
>
> Hi,
>
> Got the following error when running the full Wikipedia links example (using
> RecommenderJob) after the 3rd day of execution:
>
> 10/12/19 02:24:08 INFO mapred.JobClient:  map 100% reduce 29%
> 10/12/19 02:32:29 INFO mapred.JobClient: Task Id : attempt_201012151738_0012_r_000002_0, Status : FAILED
> java.io.IOException: Task: attempt_201012151738_0012_r_000002_0 - The reduce copier failed
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:380)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.IOException: Intermediate merge failed
> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2576)
> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2501)
> Caused by: java.lang.RuntimeException: java.io.EOFException
> at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
> at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
> at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:123)
> at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:50)
> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:447)
> at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
> at org.apache.hadoop.mapred.Merger.merge(Merger.java:107)
> at org.apache.hadoop.mapred.Merger.merge(Merger.java:93)
> at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2551)
> ... 1 more
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readByte(DataInputStream.java:250)
> at org.apache.mahout.math.Varint.readUnsignedVarInt(Varint.java:159)
> at org.apache.mahout.math.Varint.readSignedVarInt(Varint.java:140)
> at org.apache.mahout.math.hadoop.similarity.SimilarityMatrixEntryKey.readFields(SimilarityMatrixEntryKey.java:64)
> at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
> ... 9 more
>
> I was running this on a local Hadoop 0.20.2 installation, with a 1GB heap
> allocated to each of the 8 MapReduce mappers and reducers, on an 8-core
> server with 20GB of RAM.
>
> I reckon the workers may have run out of memory, as the job appears to have
> failed while doing some in-memory operations.
>
> If it's of any use to anybody I can upload the log files to S3 for
> diagnostics.
>
> Cheers
> --
> Niall Riddell
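
For reference -- this is only an illustrative sketch of the kind of setup
described above, not the actual configuration used -- on Hadoop 0.20.x the
per-task heap is set via mapred.child.java.opts, while the number of
concurrent mappers and reducers comes from the tasktracker slot limits in
mapred-site.xml:

import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Hypothetical value mirroring the setup described above: 1GB heap per map/reduce task JVM.
conf.set("mapred.child.java.opts", "-Xmx1024m");
// The 8 concurrent map and reduce slots are tasktracker-side settings
// (mapred.tasktracker.map.tasks.maximum / mapred.tasktracker.reduce.tasks.maximum
// in mapred-site.xml), not per-job properties.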
