hadoop-common-dev mailing list archives

From Anu Engineer <aengin...@hortonworks.com>
Subject Re: H9 build slave is bad
Date Wed, 08 Mar 2017 22:21:14 GMT
Hi Allen,
>	Likely something in the HDFS-7240 branch or with this patch that's doing Bad Things (tm).

Thanks for bringing this to my attention, but I am surprised that a mvn command is able to
kill a test machine.

I have pasted below the call stack from the run that you pointed to. Can you please help me
understand what you think the root cause is?
If anyone can give me pointers on how to access the H9 machine, I would love to take a look.

From the console logs, I cannot see why this run would kill the H9 machine. Even if we assume
that the test is able to kill the container, I doubt that rendering the H9 machine inoperable
is related to the patch.
Let us assume for a second that what you are saying is true and HDFS-7240 is somehow killing
these machines: why is this happening only on H9? Are HDFS runs happening only on H9?


P.S. We are able to run this on local machines without any issues. I will try to run this inside
a Docker container just to make sure that these tests are not doing something weird.

Stacks from the Console Log: 

mvn -Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-HDFS-7240-patch-1 -Ptest-patch -Pparallel-tests
-P!shelltest -Pnative -Drequire.libwebhdfs -Drequire.snappy -Drequire.openssl -Drequire.fuse
-Drequire.test.libhadoop -Pyarn-ui clean test -fae > /testptch/hadoop/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
FATAL: command execution failed
java.io.IOException: Backing channel 'H9' is disconnected.
	at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:191)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:256)
	at com.sun.proxy.$Proxy104.isAlive(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1043)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1035)
	at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:154)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:108)
	at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:65)
	at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
	at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:779)
	at hudson.model.Build$BuildExecution.build(Build.java:205)
	at hudson.model.Build$BuildExecution.doRun(Build.java:162)
	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:534)
	at hudson.model.Run.execute(Run.java:1728)
	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
	at hudson.model.ResourceController.execute(ResourceController.java:98)
	at hudson.model.Executor.run(Executor.java:404)
Caused by: hudson.remoting.Channel$OrderlyShutdown
	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1121)
	at hudson.remoting.Channel$1.handle(Channel.java:526)
	at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:83)
Caused by: Command close created at
	at hudson.remoting.Command.<init>(Command.java:59)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1115)
	at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1113)
	at hudson.remoting.Channel.close(Channel.java:1273)
	at hudson.remoting.Channel.close(Channel.java:1255)
	at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1120)
	... 2 more
Build step 'Execute shell' marked build as failure
ERROR: Step ‘Archive the artifacts’ failed: no workspace for PreCommit-HDFS-Build #18591
No JDK named ‘jdk-1.8.0’ found
[description-setter] Description set: HDFS-11451
ERROR: Step ‘Publish JUnit test result report’ failed: no workspace for PreCommit-HDFS-Build
No JDK named ‘jdk-1.8.0’ found
Finished: FAILURE
These tests are indeed passing on the local boxes, so 

On 3/8/17, 12:04 PM, "Allen Wittenauer" <aw@effectivemachines.com> wrote:

>> On Mar 8, 2017, at 9:34 AM, Sean Busbey <busbey@cloudera.com> wrote:
>> Is this HADOOP-13951?
>	Almost certainly.  Here's the run that broke it again:
>	Likely something in the HDFS-7240 branch or with this patch that's doing Bad Things (tm).
