hadoop-yarn-issues mailing list archives

From "abhishek bharani (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6914) Application application_1501553373419_0001 failed 2 times due to AM Container for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
Date Tue, 01 Aug 2017 15:00:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16109048#comment-16109048 ]

abhishek bharani commented on YARN-6914:
----------------------------------------

Below is the relevant output from the NodeManager (NM) logs:

2017-08-01 10:19:50,510 ERROR org.apache.spark.network.util.LevelDBProvider: error opening
leveldb file /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb.  Creating new file,
will not be able to recover state for existing applications
org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK:
No such file or directory
	at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
	at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
	at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
	at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:48)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
	at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
2017-08-01 10:19:50,511 WARN org.apache.spark.network.util.LevelDBProvider: error deleting
/usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb
2017-08-01 10:19:50,511 INFO org.apache.hadoop.service.AbstractService: Service spark_shuffle
failed in state INITED; cause: java.io.IOException: Unable to create state store
java.io.IOException: Unable to create state store
	at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:77)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:116)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.<init>(ExternalShuffleBlockResolver.java:94)
	at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.<init>(ExternalShuffleBlockHandler.java:65)
	at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:166)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:245)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:261)
	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:495)
	at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:543)
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: IO error: /usr/local/hadoop/tmp/nm-local-dir/registeredExecutors.ldb/LOCK:
No such file or directory
	at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200)
	at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218)
	at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
	at org.apache.spark.network.util.LevelDBProvider.initLevelDB(LevelDBProvider.java:75)
	... 15 more
2017-08-01 10:19:50,513 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Using ResourceCalculatorPlugin : null
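
The `IO error: .../registeredExecutors.ldb/LOCK: No such file or directory` in the trace above is consistent with the NodeManager local directory being missing or not writable when the spark_shuffle service initializes, although the log alone does not confirm that. A minimal Python sketch for checking it, assuming the path shown in the log; the helper name `check_local_dir` is hypothetical and is not part of Hadoop or Spark:

```python
import os


def check_local_dir(path):
    """Return a list of problems found with a NodeManager local directory."""
    problems = []
    if not os.path.isdir(path):
        # LevelDB cannot create its LOCK file if the parent directory is absent.
        problems.append("missing or not a directory: " + path)
    elif not os.access(path, os.W_OK | os.X_OK):
        problems.append("not writable by the current user: " + path)
    return problems


if __name__ == "__main__":
    # Path copied from the NM log above; adjust it for your installation.
    for problem in check_local_dir("/usr/local/hadoop/tmp/nm-local-dir"):
        print(problem)
```

Run it as the same user that starts the NodeManager, since the write check applies to the current user only.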


> Application application_1501553373419_0001 failed 2 times due to AM Container for appattempt_1501553373419_0001_000002
exited with exitCode: -1000
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-6914
>                 URL: https://issues.apache.org/jira/browse/YARN-6914
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.3
>         Environment: Mac OS
>            Reporter: abhishek bharani
>            Priority: Critical
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> I am getting the error below while running:
> spark-shell --master yarn
> Application application_1501553373419_0001 failed 2 times due to AM Container for appattempt_1501553373419_0001_000002 exited with exitCode: -1000
> For more detailed output, check application tracking page: http://abhisheks-mbp:8088/cluster/app/application_1501553373419_0001 Then, click on links to logs of each attempt.
> Diagnostics: null
> Failing this attempt. Failing the application.
> Below are the contents of yarn-site.xml:
> <configuration>
>         <!-- Site specific YARN configuration properties -->
>         <property>
>                 <name>yarn.nodemanager.aux-services</name>
>                 <value>mapreduce_shuffle</value>
>         </property>
>        <property>
>                 <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>                 <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>        </property>
>         <property>
>                 <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
>                 <value>org.apache.spark.network.yarn.YarnShuffleService</value>
>         </property>
>         <property>
>                 <name>yarn.log-aggregation-enable</name>
>                 <value>true</value>
>         </property>
>         <property>
>                 <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
>                 <value>3600</value>
>         </property>
>         <property>
>                 <name>yarn.resourcemanager.hostname</name>
>                 <value>localhost</value>
>         </property>
>         <property>
>                         <name>yarn.resourcemanager.resourcetracker.address</name>
>                         <value>${yarn.resourcemanager.hostname}:8025</value>
>                         <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                         <name>yarn.resourcemanager.scheduler.address</name>
>                         <value>${yarn.resourcemanager.hostname}:8035</value>
>                         <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                         <name>yarn.resourcemanager.address</name>
>                         <value>${yarn.resourcemanager.hostname}:8055</value>
>                         <description>Enter your ResourceManager hostname.</description>
>         </property>
>         <property>
>                         <description>The http address of the RM web application.</description>
>                         <name>yarn.resourcemanager.webapp.address</name>
>                         <value>${yarn.resourcemanager.hostname}:8088</value>
>         </property>
> I tried several solutions, but none of them works:
> 1. In yarn-site.xml, added the property yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage with the value 98.5
> 2. In yarn-site.xml, added the property yarn.nodemanager.aux-services.spark_shuffle.class with the value org.apache.spark.network.yarn.YarnShuffleService
> 3. In spark-defaults.conf, added the property spark.yarn.jars=hdfs://localhost:50010/users/spark/jars/*.jar
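
One thing that can be checked mechanically in the yarn-site.xml quoted above: the NodeManager looks up each auxiliary service's class under `yarn.nodemanager.aux-services.<name>.class`, where `<name>` is an entry of `yarn.nodemanager.aux-services`. The Python sketch below is an illustration, not a Hadoop tool, and reports names that appear on one side but not the other; `find_aux_service_mismatches` is a hypothetical helper name, and the sketch assumes the file is well-formed XML with a closing `</configuration>` tag.

```python
import xml.etree.ElementTree as ET

AUX_KEY = "yarn.nodemanager.aux-services"


def find_aux_service_mismatches(xml_text):
    """Compare listed aux-service names with the names used in *.class keys."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()

    # Names listed in the comma-separated aux-services value.
    listed = {s.strip() for s in props.get(AUX_KEY, "").split(",") if s.strip()}

    # Names taken from keys shaped like "<AUX_KEY>.<name>.class".
    prefix, suffix = AUX_KEY + ".", ".class"
    with_class = {
        key[len(prefix):-len(suffix)]
        for key in props
        if key.startswith(prefix) and key.endswith(suffix)
    }
    return {
        "listed_without_class": sorted(listed - with_class),
        "class_without_listing": sorted(with_class - listed),
    }
```

Applied to the properties as quoted, this reports `mapreduce_shuffle` as listed without a matching class key, and `mapreduce.shuffle` and `spark_shuffle` as class keys with no matching listed name. The NM log does show a spark_shuffle service being initialized, so the configuration actually loaded by the NodeManager may differ from the quoted one.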



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

