hadoop-mapreduce-user mailing list archives

From Debbie Fu <fuyulin...@gmail.com>
Subject Re: large intermediate outputs
Date Mon, 03 Jan 2011 13:11:06 GMT
I think it would fill up the disk, too. Is there any mechanism in Hadoop
that handles this situation? Suppose my local disks hold so much chunk data
that little space is left for intermediate output, and every node is in the
same state, so the task cannot be rescheduled on another node that has
enough free space. What does Hadoop do then? Does the job simply fail? Can I
point mapred.local.dir at a remote disk?
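
For reference, here is a minimal sketch of the multi-disk setting described
in the quoted message below, as it might appear in mapred-site.xml. The
directory paths are hypothetical placeholders, and I am assuming each one is
a mount point on a separate physical disk:

```xml
<!-- mapred-site.xml (sketch; the paths below are hypothetical examples) -->
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <!-- Comma-separated list, no spaces; ideally one directory per physical disk -->
    <value>/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local</value>
  </property>
</configuration>
```

This only spreads intermediate data across the listed directories; it does
not by itself answer what happens when all of them are full.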

2011/1/3 Harsh J <qwertymaniac@gmail.com>

> Additionally, you can set mapred.local.dir to be a comma-separated
> list of paths that reside on multiple disks -- this spreads I/O plus
> gives you additional space.
>
> But I suppose if a single Mapper is writing a huge amount of data for
> a single partition output, it may cause a disk fill-up. Please correct
> me if I am wrong here.
>
> On Mon, Jan 3, 2011 at 5:58 PM, Debbie Fu <fuyulin365@gmail.com> wrote:
> > Hi,
> > Is there any possibility that the intermediate output might be too large
> > to store on the local disk?
> > If there is, what does hadoop do to solve the problem?
> > Thanks.
> >
> > --
> > Best regards!
> >
> >
>
>
>
> --
> Harsh J
> www.harshj.com
>



-- 
Best regards!

Yulin Fu
SUN YAT-SEN UNIVERSITY
Mobile:13570409599
QQ:642786040
