hadoop-mapreduce-issues mailing list archives

From "David Savage (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-3981) FileNotFoundException during Mapper.setup for file embedded in job jar
Date Wed, 07 Mar 2012 15:52:58 GMT
FileNotFoundException during Mapper.setup for file embedded in job jar
----------------------------------------------------------------------

                 Key: MAPREDUCE-3981
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3981
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 1.0.0
            Reporter: David Savage


I have a job that is packaged as a jar and contains a configuration file,
"RankingRules.xml", in the root of the jar.

During mapping I load this file in Mapper.setup using (effectively) ClassLoader.getResource("RankingRules.xml").openStream().
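
For reference, the loading pattern described above can be sketched roughly as follows. This is a minimal illustration using only the JDK (Java 7 or later), not the actual job code: the class name ResourceLoader and the method readResource are hypothetical. It separates the two ways the lookup can fail: getResource returning null (the resource is not on the classpath at all) and openStream throwing (the URL resolved, but the file behind it is gone, which is what the stack trace below suggests).

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper for illustration; not part of the reported job.
class ResourceLoader {

    // Reads a classpath resource fully into memory and returns its bytes.
    static byte[] readResource(ClassLoader loader, String name) throws IOException {
        URL url = loader.getResource(name);
        if (url == null) {
            // The class loader could not locate the resource at all.
            throw new FileNotFoundException("Resource not on classpath: " + name);
        }
        // openStream() can still fail here if the URL points at a file
        // that has been deleted since the class loader was set up.
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "RankingRules.xml";
        try {
            byte[] data = readResource(ResourceLoader.class.getClassLoader(), name);
            System.out.println("Read " + data.length + " bytes from " + name);
        } catch (IOException e) {
            System.out.println("Could not load " + name + ": " + e);
        }
    }
}
```

In a mapper, a call like readResource would sit inside setup; logging the resolved URL before opening it may help show which unjar directory each task attempt is pointing at.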

This works for a while, then fails repeatedly 20-40 minutes into the job. The failed attempts
exhaust the maximum number of task retries and the job aborts.

The failure is caused by:

java.io.FileNotFoundException: /tmp/hadoop-hadoop/hadoop-unjar695679302321435361/RankingRules.xml (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:138)
	at java.io.FileInputStream.<init>(FileInputStream.java:97)
	at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
	at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
	at java.net.URL.openStream(URL.java:1035)
	... 12 more

I think a good workaround is to change the code so that it loads the ranking rules from an HDFS
URL instead of embedding them in the jar.

However, I'm a little baffled as to why the file's availability should change between tasks. This
job creates about 600 map tasks, and on the last run it failed after about 300 successful maps.

At the moment this is running in pseudo-distributed mode, so it's all on one machine.


        
