hadoop-yarn-issues mailing list archives

From "huozhanfeng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-2231) Provide feature to limit MRJob's stdout/stderr size
Date Sun, 29 Jun 2014 08:45:26 GMT

     [ https://issues.apache.org/jira/browse/YARN-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huozhanfeng updated YARN-2231:
------------------------------

    Description: 
When an MR job prints too much to stdout or stderr, the disk fills up. This is now affecting the management of our platform.

I have modified org.apache.hadoop.mapred.MapReduceChildJVM.java (borrowing the approach from org.apache.hadoop.mapred.TaskLog) so that it generates the launch command as follows:
exec /bin/bash -c "( $JAVA_HOME/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
 -Xmx1024m -Djava.io.tmpdir=$PWD/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002
-Dyarn.app.container.log.filesize=10240 -Dhadoop.root.logger=DEBUG,CLA org.apache.hadoop.mapred.YarnChild
10.106.24.108 53911 attempt_1403930653208_0003_m_000000_0 2 | tail -c 102 >/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002/stdout
; exit $PIPESTATUS ) 2>&1 |  tail -c 10240 >/logs/userlogs/application_1403930653208_0003/container_1403930653208_0003_01_000002/stderr
; exit $PIPESTATUS "

However, the size limit does not take effect.
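For reference, the truncation pattern the generated command relies on can be exercised standalone under bash (the sizes and the temp file here are illustrative, not the real container log paths):

```shell
#!/bin/bash
# Cap a command's stdout at 1024 bytes (keeping the LAST bytes, as
# "tail -c" does) while preserving the command's own exit status,
# which the plain pipeline would otherwise replace with tail's.
out=$(mktemp)
( printf 'x%.0s' $(seq 1 4096) | tail -c 1024 > "$out"
  exit "${PIPESTATUS[0]}" )
status=$?
echo "exit=$status size=$(wc -c < "$out")"
```

Run under bash (PIPESTATUS is a bash-ism, so the wrapper really must be /bin/bash, not /bin/sh), this writes only the last 1024 of the 4096 generated bytes and still exits with printf's status rather than tail's.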

Then, when I use "export YARN_NODEMANAGER_OPTS=-Xdebug -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y"
to debug the NodeManager, I find that if I set breakpoints at org.apache.hadoop.util.Shell (line
450: process = builder.start()) and org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch (line
161: List<String> newCmds = new ArrayList<String>(command.size())), the command does work.

I suspect a concurrency problem is causing the shell pipeline to misbehave. This matters to us,
and I need your help.
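One other thing that may be worth ruling out (just a guess on my side, not confirmed against the NodeManager code): whether the whole pipeline actually reaches bash as a single argument after -c. If the command line is re-tokenized anywhere between ContainerLaunch and Shell, the pipe and redirection are never interpreted:

```shell
# Passed as ONE argument after -c, the pipeline runs as expected:
one_arg=$(/bin/bash -c 'echo hello | tail -c 4')
# Split into separate words, bash -c takes only the first word as the
# script and binds the rest to $0, $1, ... so the pipe never runs:
split=$(/bin/bash -c 'echo' 'hello' '|' 'tail')
echo "one_arg=[$one_arg] split=[$split]"   # prints: one_arg=[llo] split=[]
```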

my email: huozhanfeng@gmail.com

thanks


> Provide feature to limit MRJob's stdout/stderr size
> ----------------------------------------------------
>
>                 Key: YARN-2231
>                 URL: https://issues.apache.org/jira/browse/YARN-2231
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: log-aggregation, nodemanager
>    Affects Versions: 2.3.0
>         Environment: CentOS release 5.8 (Final)
>            Reporter: huozhanfeng
>              Labels: features
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
