hadoop-mapreduce-dev mailing list archives

From Venky Shankar <yknev.shan...@gmail.com>
Subject mapred.job.split.file not present in job conf
Date Mon, 23 May 2011 07:08:11 GMT
Hey folks,

I am writing a Hadoop plugin (somewhat like FTPFileSystem) so that I can run
Map/Reduce jobs on data stored on my backing store. The jar (containing the
FileSystem implementation) is copied into Hadoop's lib/ directory, and the
necessary changes to conf/core-site.xml and conf/mapred-site.xml have been
made so that the jar is loaded when a Map/Reduce job is run. I am using
hadoop-0.20.2.

After running the start-mapred.sh script, I run a sample Map/Reduce
application (the 'Grep' example that ships with the Hadoop distribution),
during which I get the following error (in the JobTracker logs):

2011-05-23 11:25:17,464 ERROR org.apache.hadoop.mapred.JobTracker: Job initialization failed:
java.lang.IllegalArgumentException: Can not create a Path from a null string
        at org.apache.hadoop.fs.Path.checkPathArg(Path.java:78)
        at org.apache.hadoop.fs.Path.<init>(Path.java:90)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:417)
        at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3150)
        at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)

I can see the split file (job.split) and the XML conf file (job.xml) being
created in the backing store (inside
$SYSTEMDIR/tmp/hadoop-<user>/mapred/system/job_XXXXXXX_XXXX/). Judging from
the backtrace above, initTasks (inside JobInProgress) gets a null string from
the job conf for 'mapred.job.split.file'. But the entry is present in the XML
file:

$ grep mapred.job.split job.xml
<property><name>mapred.job.split.file</name><value>dummyfs://host:<port>/tmp/hadoop-<user>/mapred/system/job_201105231124_0001/job.split</value></property>
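One way to go a step beyond grep is to confirm that the property can actually be read back from the file by an XML parser. The following is only a minimal sketch using the JDK's own XML classes; it does not use Hadoop's Configuration class, so it says nothing about how the JobTracker loads job.xml. The class name JobXmlLookup, the lookup helper, and the sample URI (port 1234 and the path) are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical debugging helper: reads a job.xml-style document and
// returns the value of one named property, or null if it is absent.
public class JobXmlLookup {

    static String lookup(String xml, String key) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0
                    && key.equals(names.item(0).getTextContent().trim())) {
                // A <property> with a name but no <value> is reported as null.
                return values.getLength() > 0
                        ? values.item(0).getTextContent().trim()
                        : null;
            }
        }
        return null; // property not present at all
    }

    public static void main(String[] args) throws Exception {
        // With an argument, read that file; otherwise use a made-up sample
        // (the port and path below are placeholders, not real values).
        String xml = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : "<configuration><property>"
                  + "<name>mapred.job.split.file</name>"
                  + "<value>dummyfs://host:1234/path/to/job.split</value>"
                  + "</property></configuration>";
        System.out.println(lookup(xml, "mapred.job.split.file"));
    }
}
```

If this prints the expected URI for the real job.xml, the file itself is well-formed and contains the entry, which would suggest looking at how (or from where) the JobTracker reads the job conf rather than at the contents of the file.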

Any pointers/tips on how to debug this further? Am I missing something that
could cause this kind of behavior? Also, does the JobTracker get its
configuration from the XML file (job.xml?) or from somewhere else (in which
case the above entry in the XML file would not matter)?

Thanks,
-Venky
