hadoop-mapreduce-user mailing list archives

From Johannes Zillmann <jzillm...@googlemail.com>
Subject Re: "IOException: Filesystem closed." when trying to commit reduce output.
Date Thu, 01 Jul 2010 14:27:30 GMT
Hi Marcin,

Did you solve this error? I stumbled into the same thing, although I have no NFS involved...
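
Not sure if it applies in your case, but the most common cause of "Filesystem closed" I know of is user code closing the FileSystem it got from FileSystem.get(): that call returns a cached instance which is shared with the framework, so closing it also closes the DFSClient that FileOutputCommitter later uses for needsTaskCommit()/exists() and abortTask()/delete(). A minimal sketch of the anti-pattern (the class and field names are made up for illustration, not taken from any real job):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper, only to illustrate the anti-pattern.
public class MyMapper extends Mapper<LongWritable, Text, Text, Text> {

    private FileSystem fs;

    @Override
    protected void setup(Context context) throws IOException {
        // FileSystem.get() returns a cached instance that the framework
        // (the Child task JVM, FileOutputCommitter) shares with user code.
        fs = FileSystem.get(context.getConfiguration());
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        // fs.close();   // don't do this: it closes the shared DFSClient, and the
                         // later needsTaskCommit()/exists() call during the commit
                         // phase fails with "Filesystem closed"
    }
}

If that is what is happening, the fix is simply not to close the cached instance and to let the framework close it on JVM shutdown; newer Hadoop releases also offer ways to get a private, closeable instance, but I'm not sure those exist in 0.20.2.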

Johannes

> Hi there,
> 
> I've got a simple Map Reduce application that works perfectly when I use 
> NFS as an underlying filesystem (not using HDFS at all).
> I've got a working HDFS configuration as well - grep example works for 
> me with this configuration.
> 
> However, when I try to run the same application on HDFS instead of NFS I 
> keep receiving an "IOException: Filesystem closed." exception and the job 
> fails.
> I've spent a day searching for a solution with Google and scanning through 
> old archives but no results so far...
> 
> Job summary is:
> --->output
> 10/05/26 17:29:13 INFO mapred.JobClient: Job complete: job_201005261710_0002
> 10/05/26 17:29:13 INFO mapred.JobClient: Counters: 4
> 10/05/26 17:29:13 INFO mapred.JobClient:   Job Counters
> 10/05/26 17:29:13 INFO mapred.JobClient:     Rack-local map tasks=12
> 10/05/26 17:29:13 INFO mapred.JobClient:     Launched map tasks=16
> 10/05/26 17:29:13 INFO mapred.JobClient:     Data-local map tasks=4
> 10/05/26 17:29:13 INFO mapred.JobClient:     Failed map tasks=1
> 
> Each map task's attempt log reads something like:
> --->attempt_201005261710_0001_m_000000_3/syslog:
> 2010-05-26 17:13:47,297 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
> 2010-05-26 17:13:47,470 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 100
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: data buffer = 79691776/99614720
> 2010-05-26 17:13:47,688 INFO org.apache.hadoop.mapred.MapTask: record buffer = 262144/327680
> 2010-05-26 17:13:47,712 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
> 2010-05-26 17:13:47,784 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
> 2010-05-26 17:13:47,788 INFO org.apache.hadoop.mapred.TaskRunner: Task:attempt_201005261710_0001_m_000000_3 is done. And is in the process of commiting
> 2010-05-26 17:13:47,797 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> java.io.IOException: Filesystem closed
>    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
>    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
>    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
>    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
>    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.needsTaskCommit(FileOutputCommitter.java:217)
>    at org.apache.hadoop.mapred.Task.done(Task.java:671)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:309)
>    at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 2010-05-26 17:13:47,802 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
> 2010-05-26 17:13:47,802 WARN org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Error discarding output
> java.io.IOException: Filesystem closed
>    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
>    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:580)
>    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227)
>    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.abortTask(FileOutputCommitter.java:179)
>    at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:815)
>    at org.apache.hadoop.mapred.Child.main(Child.java:191)
> 
> No reduce tasks are run, as the map tasks haven't managed to save their 
> output.
> 
> These exceptions are visible in the JobTracker's log as well. What is the 
> reason for this exception? Is it critical? (I guess it is, but it's 
> listed in the JobTracker's log as INFO, not ERROR.)
> 
> My config (I'm not sure which directories should be local and which should 
> be located on HDFS; maybe the issue is somewhere here?):
> 
> ---->core-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
> <configuration>
> <property>
> <name>fs.default.name</name>
> <value>hdfs://blade02:5432/</value>
> </property>
> <property>
> <name>hadoop.tmp.dir</name>
> <value>/tmp/hadoop/tmp</value> <!-- local -->
> </property>
> 
> </configuration>
> 
> ---->hdfs-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
> <configuration>
> <property>
> <name>dfs.replication</name>
> <value>1</value>
> </property>
> <property>
> <name>dfs.name.dir</name>
> <value>/tmp/hadoop/name2</value> <!-- local dir where HDFS is located-->
> </property>
> <property>
> <name>dfs.data.dir</name>
> <value>/tmp/hadoop/data</value> <!-- local dir where HDFS is located -->
> </property>
> </configuration>
> 
> ---->mapred-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
> <configuration>
> <property>
> <name>mapred.job.tracker</name>
> <value>blade02:5435</value>
> </property>
> <property>
> <name>mapred.temp.dir</name>
> <value>mapred_tmp</value> <!-- on HDFS I suppose -->
> </property>
> <property>
> <name>mapred.system.dir</name>
> <value>system</value> <!-- on HDFS I suppose -->
> </property>
> <property>
> <name>mapred.local.dir</name>
> <value>/tmp/hadoop/local</value> <!-- local -->
> </property>
> <property>
> <name>mapred.task.tracker.http.address</name>
> <value>0.0.0.0:0</value>
> </property>
> <property>
> <name>mapred.textoutputformat.separator</name>
> <value>,</value>
> </property>
> </configuration>
> 
> I'm using Hadoop 0.20.2 (new API -> org.apache.hadoop.mapreduce.*, 
> default OutputFormat and RecordWriter), running on a 3-node cluster 
> (blade02, blade03, blade04). blade02 is a master, all of them are 
> slaves. My OS: Linux blade02 2.6.9-42.0.2.ELsmp #1 SMP Tue Aug 22 
> 17:26:55 CDT 2006 i686 i686 i386 GNU/Linux.
> 
> Note that there are currently 3 filesystems in my configuration:
> /tmp/* - a local fs on each node
> /home/* - an NFS share common to all nodes - this is where Hadoop is installed
> hdfs://blade02:5432/* - HDFS
> 
> I'm not sure if this is relevant, but the intermediate (key, value) pair is 
> of type (Text, TermVector), and TermVector's Writable methods are 
> implemented like this:
>    public class TermVector implements Writable {
>            private Map<Text, IntWritable> vec = new HashMap<Text, IntWritable>();
> 
>            @Override
>            public void write(DataOutput out) throws IOException {
>                    out.writeInt(vec.size());
>                    for (Map.Entry<Text, IntWritable> e : vec.entrySet()) {
>                            e.getKey().write(out);
>                            e.getValue().write(out);
>                    }
>            }
> 
>            @Override
>            public void readFields(DataInput in) throws IOException {
>                    int n = in.readInt();
>                    for (int i = 0; i < n; ++i) {
>                            Text t = new Text();
>                            t.readFields(in);
>                            IntWritable iw = new IntWritable();
>                            iw.readFields(in);
>                            vec.put(t, iw);
>                    }
>            }
> ...
> }
> 
> Any help appreciated.
> 
> Many thanks,
> Marcin Sieniek
> 
> 
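
Probably unrelated to the "Filesystem closed" error, but one thing worth double-checking in the TermVector code quoted above: Hadoop reuses Writable instances when deserializing values, so readFields() should reset the object's state before filling it; as written, successive calls keep accumulating entries in vec. A minimal sketch of the change (same method, just with a clear() added):

   @Override
   public void readFields(DataInput in) throws IOException {
           vec.clear();               // reset state: Hadoop reuses Writable objects between records
           int n = in.readInt();
           for (int i = 0; i < n; ++i) {
                   Text t = new Text();
                   t.readFields(in);
                   IntWritable iw = new IntWritable();
                   iw.readFields(in);
                   vec.put(t, iw);
           }
   }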


