hadoop-yarn-issues mailing list archives

From "Wang Hao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3011) NM dies because of the failure of resource localization
Date Wed, 07 Jan 2015 03:45:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267246#comment-14267246 ]

Wang Hao commented on YARN-3011:
--------------------------------

I submitted a job to Oozie. In my workflow.xml, the value of the <script> element
mistakenly ends with '/':
<workflow-app xmlns="uri:oozie:workflow:0.2" name="hive-wf">
    <start to="create_hive"/>

    <action name="create_hive">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.action.sharelib.for.hive</name>
                    <value>hive2</value>
                </property>
                <property>
                    <name>oozie.launcher.action.main.class</name>
                    <value>org.apache.oozie.action.hadoop.Hive2Main</value>
                </property>
                <property>
                    <name>mapreduce.job.queuename</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>test_ooize_job1.sql/</script>
            <param>hivevar:dbname=offline</param>
            <param>hivevar:partition_date=20141228</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

When the NM localizes resources, the file "test_ooize_job1.sql/" causes an exception in
getPathForLocalization of LocalResourcesTrackerImpl.

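In getPathForLocalization, the second argument passed to the Path constructor ends up empty
(the stack trace reports "Can not create a Path from an empty string"), because the resource path ends with '/':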
Path localPath = new Path(rPath, req.getPath().getName());
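The following is a minimal sketch of why the name comes back empty. It does not use Hadoop itself. The helper lastComponent is hypothetical and only mimics the assumed behaviour of Path.getName(), namely returning whatever follows the final '/' in the URI path. The host name nn is a placeholder.

```java
import java.net.URI;

public class TrailingSlashDemo {
    // Hypothetical stand-in for Path.getName(): assumed to return
    // everything after the last '/' in the URI path.
    public static String lastComponent(URI uri) {
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        URI good = URI.create("hdfs://nn:8020/user/hive/test_ooize_job1.sql");
        URI bad = URI.create("hdfs://nn:8020/user/hive/test_ooize_job1.sql/");
        // Brackets make the empty result visible.
        System.out.println("[" + lastComponent(good) + "]");
        System.out.println("[" + lastComponent(bad) + "]");
    }
}
```

With the trailing slash, nothing follows the last '/', so the second call yields an empty string, which is what the Path constructor then rejects.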

Finally, the exception causes AsyncDispatcher to shut down the JVM.
So I think we should handle this exception; otherwise, one bad job can cause many NMs to die.
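One possible shape for such handling is sketched below. It is not the actual YARN code or a proposed patch: LocalizationGuard, safeName and handle are hypothetical names, and the returned strings merely stand in for whatever event the NM would raise. The idea is to validate the resource name first and fail only that resource rather than letting the exception reach the dispatcher thread.

```java
import java.net.URI;
import java.util.Optional;

public class LocalizationGuard {
    // Returns the file name to localize, or empty if the URI path
    // has no usable last component (e.g. it ends with '/').
    public static Optional<String> safeName(URI resource) {
        String path = resource.getPath();
        if (path == null) {
            return Optional.empty();
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    // Hypothetical handler: reports a failure for the single bad
    // resource instead of throwing from the event-handling path.
    public static String handle(URI resource) {
        return safeName(resource)
                .map(name -> "LOCALIZE " + name)
                .orElse("FAILED invalid resource path: " + resource);
    }
}
```

With a guard like this, a malformed path would fail only the container that requested it, and the dispatcher would keep running.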

> NM dies because of the failure of resource localization
> -------------------------------------------------------
>
>                 Key: YARN-3011
>                 URL: https://issues.apache.org/jira/browse/YARN-3011
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.1
>            Reporter: Wang Hao
>            Assignee: Varun Saxena
>
> NM dies because of an IllegalArgumentException when localizing a resource.
> 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hadoop/share/lib/oozie/json-simple-1.1.jar, 1416997035456, FILE, null }
> 2014-12-29 13:43:58,699 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Downloading public rsrc:{ hdfs://hadoop002.dx.momo.com:8020/user/hive/src/final_test_ooize/test_ooize_job1.sql/, 1419831474153, FILE, null }
> 2014-12-29 13:43:58,701 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
>         at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:135)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:94)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl.getPathForLocalization(LocalResourcesTrackerImpl.java:420)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:758)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:672)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.handle(ResourceLocalizationService.java:614)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:745)
> 2014-12-29 13:43:58,701 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user hadoop
> 2014-12-29 13:43:58,702 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
> 2014-12-29 13:43:58,704 INFO org.apache.hadoop.mapred.ShuffleHandler: Setting connection close header...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
