hadoop-common-user mailing list archives

From Jim Twensky <jim.twen...@gmail.com>
Subject Re: getting DiskErrorException during map
Date Thu, 16 Apr 2009 19:23:23 GMT
Yes, here is how it looks:

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/scratch/local/jim/hadoop-${user.name}</value>
    </property>

so I don't know why it still writes to /tmp. As a temporary workaround, I
created a symbolic link from /tmp/hadoop-jim to /scratch/... and it works
fine now, but if you think this might be considered a bug, I can report it.
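
In case it helps anyone else debugging this, here is a quick way to print
the values the daemons actually end up with once hadoop-default.xml and
hadoop-site.xml are loaded (a minimal sketch against the JobConf API in
0.19; the class name is just for illustration):

    import org.apache.hadoop.mapred.JobConf;

    public class DumpLocalDirs {
        public static void main(String[] args) {
            // new JobConf() loads hadoop-default.xml and then hadoop-site.xml
            // from the classpath, so these are the effective values.
            JobConf conf = new JobConf();
            System.out.println("hadoop.tmp.dir   = " + conf.get("hadoop.tmp.dir"));
            System.out.println("mapred.local.dir = " + conf.get("mapred.local.dir"));
        }
    }

Running it on a tasktracker node with `bin/hadoop DumpLocalDirs` should pick
up the same classpath and configuration directory the daemons use.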

Thanks,
Jim


On Thu, Apr 16, 2009 at 12:44 PM, Alex Loddengaard <alex@cloudera.com> wrote:

> Have you set hadoop.tmp.dir away from /tmp as well?  If hadoop.tmp.dir is
> set somewhere in /scratch vs. /tmp, then I'm not sure why Hadoop would be
> writing to /tmp.
>
> Hope this helps!
>
> Alex
>
> On Wed, Apr 15, 2009 at 2:37 PM, Jim Twensky <jim.twensky@gmail.com> wrote:
>
> > Alex,
> >
> > Yes, I bounced the Hadoop daemons after I changed the configuration files.
> >
> > I also tried setting $HADOOP_CONF_DIR to the directory where my
> > hadoop-site.xml file resides, but it didn't work. However, I'm sure that
> > HADOOP_CONF_DIR is not the issue, because other properties that I changed
> > in hadoop-site.xml seem to be properly set. Also, here is a section from
> > my hadoop-site.xml file:
> >
> >     <property>
> >         <name>hadoop.tmp.dir</name>
> >         <value>/scratch/local/jim/hadoop-${user.name}</value>
> >     </property>
> >     <property>
> >         <name>mapred.local.dir</name>
> >         <value>/scratch/local/jim/hadoop-${user.name}/mapred/local</value>
> >     </property>
> >
> > I also created /scratch/local/jim/hadoop-jim/mapred/local on each task
> > tracker, since I know directories that do not exist are ignored.
> >
> > When I manually ssh to the task trackers, I can see that the directory
> > /scratch/local/jim/hadoop-jim/dfs is automatically created, so it seems
> > like hadoop.tmp.dir is set properly. However, Hadoop still creates
> > /tmp/hadoop-jim/mapred/local and uses that directory for local storage.
> >
> > I'm starting to suspect that mapred.local.dir is overwritten to a default
> > value of /tmp/hadoop-${user.name} somewhere inside the binaries.
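> >
> > For what it's worth, I believe the stock hadoop-default.xml sets
> > mapred.local.dir to ${hadoop.tmp.dir}/mapred/local. One way to see
> > whether the value is coming from the site file or from the baked-in
> > default is a small check like this (a sketch against the Configuration
> > API; getRaw() returns the value before ${...} expansion, and the class
> > name is just for illustration):
> >
> >     import org.apache.hadoop.conf.Configuration;
> >
> >     public class WhichDefault {
> >         public static void main(String[] args) {
> >             Configuration conf = new Configuration();
> >             // If this prints "${hadoop.tmp.dir}/mapred/local", the
> >             // baked-in default is winning over hadoop-site.xml.
> >             System.out.println(conf.getRaw("mapred.local.dir"));
> >             // The expanded value Hadoop will actually use:
> >             System.out.println(conf.get("mapred.local.dir"));
> >         }
> >     }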
> >
> > -jim
> >
> > On Tue, Apr 14, 2009 at 4:07 PM, Alex Loddengaard <alex@cloudera.com> wrote:
> >
> > > First, did you bounce the Hadoop daemons after you changed the
> > > configuration files?  I think you'll have to do this.
> > >
> > > Second, I believe 0.19.1 has hadoop-default.xml baked into the jar.
> > > Try setting $HADOOP_CONF_DIR to the directory where hadoop-site.xml
> > > lives.  For whatever reason your hadoop-site.xml (and the
> > > hadoop-default.xml you tried to change) are probably not being loaded.
> > > $HADOOP_CONF_DIR should fix this.
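> > >
> > > To see where (or whether) hadoop-site.xml is actually being picked up,
> > > you can also ask the classloader directly; this is plain JDK, nothing
> > > Hadoop-specific (a minimal sketch, class name for illustration only):
> > >
> > >     import java.net.URL;
> > >
> > >     public class FindSiteXml {
> > >         public static void main(String[] args) {
> > >             // null means no hadoop-site.xml is on the classpath at all
> > >             URL url = Thread.currentThread().getContextClassLoader()
> > >                     .getResource("hadoop-site.xml");
> > >             System.out.println("hadoop-site.xml loaded from: " + url);
> > >         }
> > >     }
> > >
> > > Running it with `bin/hadoop FindSiteXml` should use the same classpath
> > > the daemons see.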
> > >
> > > Good luck!
> > >
> > > Alex
> > >
> > > On Mon, Apr 13, 2009 at 11:25 AM, Jim Twensky <jim.twensky@gmail.com> wrote:
> > >
> > > > Thank you Alex, you are right. There are quotas on the systems that
> > > > I'm working on. However, I tried to change mapred.local.dir as follows:
> > > >
> > > > --inside hadoop-site.xml:
> > > >
> > > >    <property>
> > > >        <name>mapred.child.tmp</name>
> > > >        <value>/scratch/local/jim</value>
> > > >    </property>
> > > >    <property>
> > > >        <name>hadoop.tmp.dir</name>
> > > >        <value>/scratch/local/jim</value>
> > > >    </property>
> > > >    <property>
> > > >        <name>mapred.local.dir</name>
> > > >        <value>/scratch/local/jim</value>
> > > >    </property>
> > > >
> > > > and observed that the intermediate map outputs are still being
> > > > written under /tmp/hadoop-jim/mapred/local.
> > > >
> > > > I'm confused at this point, since I also tried setting these values
> > > > directly inside hadoop-default.xml and that didn't work either. Is
> > > > there any other property that I'm supposed to change? I tried
> > > > searching for "/tmp" in the hadoop-default.xml file but couldn't
> > > > find anything else.
> > > >
> > > > Thanks,
> > > > Jim
> > > >
> > > >
> > > > On Tue, Apr 7, 2009 at 9:35 PM, Alex Loddengaard <alex@cloudera.com> wrote:
> > > >
> > > > > The getLocalPathForWrite function that throws this Exception
> > > > > assumes that you have space on the disks that mapred.local.dir
> > > > > is configured on.  Can you verify with `df` that those disks have
> > > > > space available?  You might also try moving mapred.local.dir off
> > > > > of /tmp if it's configured to use /tmp right now; I believe some
> > > > > systems have quotas on /tmp.
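> > > > >
> > > > > Roughly speaking, the allocator walks the mapred.local.dir entries
> > > > > looking for one it can write the output to, and throws once none
> > > > > qualifies. A simplified sketch of that check (not the actual Hadoop
> > > > > source, just an illustration with plain java.io.File):
> > > > >
> > > > >     import java.io.File;
> > > > >
> > > > >     public class CheckLocalDirs {
> > > > >         public static void main(String[] args) {
> > > > >             // substitute your own mapred.local.dir entries here
> > > > >             String[] dirs = { "/tmp/hadoop-jim/mapred/local" };
> > > > >             long needed = 1L << 20; // bytes the output needs; illustrative
> > > > >             for (String d : dirs) {
> > > > >                 File f = new File(d);
> > > > >                 boolean ok = f.isDirectory() && f.canWrite()
> > > > >                         && f.getUsableSpace() > needed;
> > > > >                 System.out.println(d + (ok ? " -> usable" : " -> rejected"));
> > > > >             }
> > > > >             // when every entry is rejected, the real allocator throws
> > > > >             // "Could not find any valid local directory for ..."
> > > > >         }
> > > > >     }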
> > > > >
> > > > > Hope this helps.
> > > > >
> > > > > Alex
> > > > >
> > > > > On Tue, Apr 7, 2009 at 7:22 PM, Jim Twensky <jim.twensky@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm using Hadoop 0.19.1 and I have a very small test cluster
> > > > > > with 9 nodes, 8 of them being task trackers. I'm getting the
> > > > > > following error and my jobs keep failing when map processes
> > > > > > start hitting 30%:
> > > > > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_200904072051_0001/attempt_200904072051_0001_m_000000_1/output/file.out
> > > > > >         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
> > > > > >         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> > > > > >         at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
> > > > > >         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1209)
> > > > > >         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:867)
> > > > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> > > > > >         at org.apache.hadoop.mapred.Child.main(Child.java:158)
> > > > > >
> > > > > >
> > > > > > I googled many blogs and web pages, but I could neither
> > > > > > understand why this happens nor find a solution. What does that
> > > > > > error message mean and how can I avoid it? Any suggestions?
> > > > > >
> > > > > > Thanks in advance,
> > > > > > -jim
> > > > > >
> > > > >
> > > >
> > >
> >
>
