hadoop-common-user mailing list archives

From Jeff Hammerbacher <ham...@cloudera.com>
Subject Re: io.sort.mb configuration?
Date Wed, 23 Dec 2009 01:34:37 GMT
Hey Mark,

While you're grokking this aspect of MapReduce's configuration, you may want
to check out https://issues.apache.org/jira/browse/MAPREDUCE-64, which is on
its way into trunk right now. Chris Douglas from Yahoo! has posted a very
nice explanation of how buffers are managed during the shuffle and which
parameters affect the behavior.

Regards,
Jeff

On Tue, Dec 22, 2009 at 12:30 PM, Mark Vigeant <mark.vigeant@riskmetrics.com> wrote:

> Thank you for the responses, guys!
>
> Patrick: I didn't set it in the code, but setting it there is a really
> good idea, so I will play around with that.
>
> Long: I should have clarified that I am using 0.20.1, so this is a bit
> different. I set the parameter in mapred-site.xml, and for some reason it
> is just not taking effect. Thank you anyway, though!
>
> -Mark
>
> -----Original Message-----
> From: Long Van Nguyen Dinh [mailto:muntron@gmail.com]
> Sent: Tuesday, December 22, 2009 12:17 PM
> To: common-user@hadoop.apache.org
> Subject: Re: io.sort.mb configuration?
>
> Hadoop has a default file for all configuration (hadoop-default.xml in
> version 0.19). Don't change the values in that file (the changes won't
> take effect); instead, copy the parameter to hadoop-site.xml, the file
> where you set up the cluster, and set the value you want there.
>
> Long Van
>
> On Tue, Dec 22, 2009 at 11:40 AM, Patrick Angeles
> <patrickangeles@gmail.com> wrote:
> > You can also set that param per-job. Maybe you called some code that did
> > that behind the scenes?
> >
> > On Tue, Dec 22, 2009 at 11:10 AM, Mark Vigeant <mark.vigeant@riskmetrics.com> wrote:
> >
> >> Hey Everyone-
> >>
> >> I've been playing around with Hadoop and HBase for a while, and when
> >> running a program to upload data into an HTable I noticed this output:
> >>
> >> INFO mapred.MapTask: io.sort.mb = 100
> >>
> >> Which is the default value, but in the mapred configuration on all
> >> machines in my cluster I set this value to 250. Could it be that my
> >> program is not accessing the configuration properly? Is that too large
> >> a value? Or is it most likely just a foolish syntax error on my part?
> >>
> >> Thank you very much, all input is appreciated.
> >>
> >> Mark Vigeant
> >> RiskMetrics Group, Inc.
> >>
> >>
> >> This email message and any attachments are for the sole use of the
> >> intended recipients and may contain proprietary and/or confidential
> >> information which may be privileged or otherwise protected from
> >> disclosure. Any unauthorized review, use, disclosure or distribution is
> >> prohibited. If you are not an intended recipient, please contact the
> >> sender by reply email and destroy the original message and any copies of
> >> the message as well as any attachments to the original message.
> >>
> >
>
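
The precedence discussed in this thread (built-in defaults, then the site file, then any per-job setting) can be illustrated outside Hadoop. What follows is a minimal Python sketch, not Hadoop's actual configuration loader. The function names `parse_properties` and `resolve` and the sample XML strings are hypothetical. It assumes only the usual `<configuration>/<property>/<name>/<value>` layout of Hadoop configuration files, and it ignores features such as `<final>` properties.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents standing in for mapred-default.xml and mapred-site.xml.
DEFAULT_XML = """<configuration>
  <property><name>io.sort.mb</name><value>100</value></property>
</configuration>"""

SITE_XML = """<configuration>
  <property><name>io.sort.mb</name><value>250</value></property>
</configuration>"""


def parse_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        # Skip malformed entries that lack a name or a value.
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def resolve(key, *layers):
    """Look up key across layers; later layers override earlier ones."""
    result = None
    for layer in layers:
        if key in layer:
            result = layer[key]
    return result


defaults = parse_properties(DEFAULT_XML)
site = parse_properties(SITE_XML)
job_overrides = {}  # per-job settings; empty unless the job code sets something

print(resolve("io.sort.mb", defaults, site, job_overrides))
```

In this sketch, if the site layer is never loaded, or the property name is misspelled in it, the lookup falls back to the default of 100, which would match the symptom in the original question. A per-job entry such as `{"io.sort.mb": "400"}` in the last layer would take priority over both files.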
