hadoop-common-user mailing list archives

From "Miles Osborne" <mi...@inf.ed.ac.uk>
Subject Hadoop-2438
Date Tue, 22 Jan 2008 14:26:21 GMT
Has there been any progress on this, or is there a work-around?

Currently I'm experimenting with Streaming and I've encountered what looks
like the same problem as described here:

https://issues.apache.org/jira/browse/HADOOP-2438

So, I get much the same errors (see below).

For this particular task, when I replace the mappers and reducers with the
identity operation (i.e. just pass the data through), all is well.  When I
instead try to do something more taxing (in this case, gathering together
all ngrams with the same prefix), I get these errors.
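
For reference, a minimal sketch of the kind of Streaming mapper meant by
"gathering together all ngrams with the same prefix". This is not the actual
script from the failing job. It assumes one whitespace-separated ngram per
input line and a prefix of the first two tokens; the names prefix_key,
map_lines and main are hypothetical. It handles one line at a time, so the
mapper itself holds no more than a single record in memory.

```python
import sys


def prefix_key(ngram, prefix_len=2):
    """Return the first prefix_len tokens of a whitespace-separated ngram.

    Assumption: the grouping prefix is a fixed number of leading tokens.
    """
    tokens = ngram.split()
    return " ".join(tokens[:prefix_len])


def map_lines(lines, out, prefix_len=2):
    """Write 'prefix<TAB>ngram' for each non-empty input line.

    Lines are processed one at a time and nothing is accumulated, so the
    mapper's memory use does not grow with the size of the input.
    """
    for line in lines:
        ngram = line.strip()
        if not ngram:
            continue  # skip blank lines
        out.write(prefix_key(ngram, prefix_len) + "\t" + ngram + "\n")


def main():
    # Streaming feeds records on stdin and reads key<TAB>value from stdout.
    map_lines(sys.stdin, sys.stdout)
```

A script passed as the Streaming mapper would end with a call to main(). The
framework sorts on the text before the first tab, so all ngrams sharing a
prefix arrive at the reducer together.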

My guess is that this has something to do with caching / buffering.  I
presume that when the Streaming mapper has real work to do, the associated
Java streamer buffers input until the mapper signals that it can process
more data.  If the mapper is busy, a lot of data would get cached, causing
some internal buffer to overflow.

Miles


Date: Tue Jan 22 14:12:28 GMT 2008
java.io.IOException: Broken pipe
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:260)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:124)
	at java.io.DataOutputStream.flush(DataOutputStream.java:106)
	at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:96)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)


	at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:107)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

java.io.IOException: MROutput/MRErrThread failed:java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2786)
	at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.io.Text.write(Text.java:243)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:349)
	at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:344)

	at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:76)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

java.io.IOException: MROutput/MRErrThread failed:java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2786)
	at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	at org.apache.hadoop.io.Text.write(Text.java:243)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:349)
	at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:344)

	at org.apache.hadoop.streaming.PipeMapper.map(PipeMapper.java:76)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
