accumulo-dev mailing list archives

From John Vines <john.w.vi...@ugov.gov>
Subject Re: memory usage & process distribution
Date Mon, 23 Jul 2012 18:32:55 GMT
I was just referring to

mapred.map.tasks
mapred.reduce.tasks
mapred.child.java.opts

These set the maximum number of map and reduce slots per node, and how
much memory each task can use.
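
The budget described here (child memory * (map slots + reduce slots), plus whatever else runs on the node) can be checked with a little arithmetic. Below is a minimal sketch in Python; the function names and all of the numbers in the usage lines are hypothetical, not values from this thread. One caveat: in Hadoop 1.x the per-node slot limits are normally the tasktracker properties mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, while mapred.map.tasks and mapred.reduce.tasks are per-job task-count defaults, so check which ones your mapred-site.xml actually sets. The sketch also counts only the -Xmx heap from mapred.child.java.opts; a real JVM uses somewhat more than its heap, so leave headroom.

```python
import re


def heap_mb_from_java_opts(java_opts: str) -> int:
    """Extract the -Xmx heap size, in MB, from a JVM options string."""
    match = re.search(r"-Xmx(\d+)([mMgG])", java_opts)
    if match is None:
        raise ValueError(f"no -Xmx setting found in {java_opts!r}")
    size, unit = int(match.group(1)), match.group(2).lower()
    return size * 1024 if unit == "g" else size


def task_memory_mb(map_slots: int, reduce_slots: int, child_java_opts: str) -> int:
    """Worst-case heap used by MapReduce children when every slot is busy."""
    return (map_slots + reduce_slots) * heap_mb_from_java_opts(child_java_opts)


def fits_in_ram(node_ram_mb: int, other_daemons_mb: int,
                map_slots: int, reduce_slots: int, child_java_opts: str) -> bool:
    """True if the task heaps plus the other daemons fit in physical RAM.

    other_daemons_mb is an estimate you supply for everything else on the
    node (DataNode, TaskTracker, TabletServer, OS, and so on).
    """
    total = other_daemons_mb + task_memory_mb(map_slots, reduce_slots, child_java_opts)
    return total <= node_ram_mb


if __name__ == "__main__":
    # Hypothetical worker: 7680 MB of RAM, 3072 MB reserved for other daemons,
    # 4 map slots and 2 reduce slots.
    print(fits_in_ram(7680, 3072, 4, 2, "-Xmx512m"))  # 6 * 512 + 3072 = 6144 MB
    print(fits_in_ram(7680, 3072, 4, 2, "-Xmx1g"))    # 6 * 1024 + 3072 = 9216 MB
```

With these made-up numbers the first configuration fits and the second does not, which is the case that would push the node into swap.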

John

On Mon, Jul 23, 2012 at 1:20 PM, Miguel Pereira
<miguelapereira1@gmail.com> wrote:

> John,
>
> For configuring MapReduce, do you mean adding the
>
> mapred.local.dir
> mapred.system.dir
> mapred.temp.dir
>
> properties to mapred-site.xml?
>
>
>
> On Mon, Jul 23, 2012 at 11:33 AM, John Vines <john.w.vines@ugov.gov>
> wrote:
>
> > On Mon, Jul 23, 2012 at 11:21 AM, Miguel Pereira
> > <miguelapereira1@gmail.com> wrote:
> >
> > > Hey guys,
> > >
> > > I want to set up a realistic production cluster on Amazon's EC2 and I am
> > > trying to decide 2 things.
> > >
> > >
> > >    -  Memory usage
> > >
> > > If I use one of the example configuration files, say the 512MB one, does
> > > that mean that all Accumulo processes will use up a total of 512MB? At
> > > least this appears to be the case when looking at accumulo-env.sh.
> > > This will determine whether I use a small or large instance.
> > >
> > >
> > >
> > Yes, it sets it up so all of the Accumulo processes have a footprint no
> > bigger than 512MB. Mind you, we only have one configuration that is set up
> > for things in a distributed fashion, which is 3GB. So if you're running
> > multiple nodes, you can up some of the configurations for a larger
> > footprint because you won't be running every process on every node.
> >
> >
> > >    - Process Distribution
> > >
> > > Is this a standard configuration? I will start off with a small # of
> > > worker nodes (3-4) & hope to use my local machine as a "monitor" for the
> > > Accumulo & Ganglia web UIs in order to avoid ssh -X latency.
> > >
> > > [ Name Node ] Name Node, Gmond
> > > [ Secondary NN ] Secondary Name Node, Gmond
> > > [ Job Tracker ] JobTracker, Gmond
> > > [ Zookeeper ] Zookeeper
> > > [ Accumulo Master ] Master, Tracer, Garbage Collector, Gmond, Jmxtrans
> > > [ Monitor ] Monitor, Gmetad, Gweb
> > > [ Worker Node ] DataNode, Tasktracker, TabletServer, Logger, Gmond,
> > > Jmxtrans
> > >
> > That looks good to me. Just make sure you configure your MapReduce so
> > that child memory * (reduce slots + map slots) isn't enough to cause
> > swapping.
> >
> > >
> > > Thanks,
> > >
> > > Miguel
> > >
> >
> > John
> >
>
