hadoop-mapreduce-user mailing list archives

From Srinivas Surasani <hivehadooplearn...@gmail.com>
Subject Hadoop streaming job failure
Date Sun, 21 Jul 2013 05:56:48 GMT
Hi All,

I'm running a Hadoop streaming job over 100 GB of data on a 50-node cluster.
The job succeeds on small amounts of data, but when it runs over the full
100 GB it fails with a "memory error" and a "Broken pipe" error. Each node
has plenty of free memory.

Is there a way to increase the memory available to Python streaming tasks?
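
To make the question concrete, these are the knobs I've been looking at
(the jar path and values below are placeholders, not my actual command;
the property names are the Hadoop 1.x ones):

    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
        -D mapred.child.java.opts=-Xmx1024m \
        -D mapred.child.ulimit=4194304 \
        -D mapred.job.map.memory.mb=2048 \
        -D mapred.job.reduce.memory.mb=2048 \
        -input /user/srinivas/input \
        -output /user/srinivas/output \
        -mapper mapper.py \
        -reducer reducer.py \
        -file mapper.py -file reducer.py

My understanding is that mapred.child.java.opts only sizes the task JVM
heap, while mapred.child.ulimit (in KB) bounds the virtual memory of the
launched child processes, including the Python subprocess. Is that the
right way to give the streaming script more headroom?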

Below is a sample of the error logs:

cause:java.io.IOException: subprocess still running
R/W/S=32771708/10/0 in:34752=32771708/943 [rec/s] out:0=10/943 [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
HOST=null
USER=root
HADOOP_USER=null
last Hadoop input: |null|
Broken pipe
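
For reference, the mapper follows the usual streaming pattern of reading
stdin one line at a time; this is a simplified sketch of its shape, not
the actual script:

    #!/usr/bin/env python
    import sys

    def main():
        # Stream stdin line by line so resident memory stays flat
        # regardless of input size; accumulating records in a list
        # or dict is what usually blows up at the 100 GB scale.
        for line in sys.stdin:
            fields = line.rstrip('\n').split('\t')
            if not fields[0]:
                continue
            # Emit key<TAB>value; the framework shuffles on the key.
            sys.stdout.write('%s\t%s\n' % (fields[0], 1))

    if __name__ == '__main__':
        main()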


Any help appreciated.

Thanks,
Srinivas
