hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-477) MiniYARNCluster: When container executor script fails to launch App Master, NM logs error, but Client doesn't get signaled to kill the job
Date Thu, 28 Mar 2013 23:27:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616817#comment-13616817 ]

Vinod Kumar Vavilapalli commented on YARN-477:
----------------------------------------------

Eli, please reopen the ticket if you run into this again. Tx.
                
> MiniYARNCluster: When container executor script fails to launch App Master, NM logs error, but Client doesn't get signaled to kill the job
> ------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-477
>                 URL: https://issues.apache.org/jira/browse/YARN-477
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eli Reisman
>            Assignee: Zhijie Shen
>
> I have been porting Giraph to YARN (GIRAPH-13 is the issue). When I launch my App Master and the container command line runs it successfully, any failure in the App Master or my launched Giraph Tasks is promptly reported to the Client and ends my job run. However, if the command line sent to the App Master container fails to launch it at all, the error exit code does not propagate. My client hangs with the job at containersUsed == 1 and state == ACCEPTED for as long as I am willing to sit and wait before CTRL-C'ing my way out.
> Disclaimer: this could be my fault, but I wanted to throw it out there in case it's not. When this happens I also get no error logs, since the App Master never launched, so I really have no visibility into why it failed to launch. I am sure it's not launching, but the client IS sending the app request and getting a container for my AM, and I see the command line run on the container in my logs. That's all.
> Thanks! If this is a dup or "won't fix" for some reason, let me know, and sorry for wasting your time!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
