accumulo-user mailing list archives

From James Hughes
Subject Re: Small cluster Hadoop/Accumulo process placement recommendation
Date Wed, 17 Apr 2013 03:02:21 GMT
Hi Terry,

From my limited experience, I'd say you have enough to get started.  I've
set up a small cloud with just 6 nodes on AWS:  One
namenode/jobtracker/Cloudbase (Accumulo when it was first released)
machine, one zookeeper, and 4 datanode/tasktracker/tabletserver nodes.
(Yes, I believe you should be able to run the Accumulo Master on the Hadoop

The cloud was set up to test out running things on AWS, so I didn't do
anything terribly data intensive on it.  The worst issue I had was that
MapReduce jobs needed more than a gig of memory, so early on I had to
switch from medium-size machines (with 4 gigs of RAM) to large instances (8
gigs of RAM).
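
A rough way to see why a small instance runs out of room is to add up what
each co-located process is allowed to use and compare that with the node's
RAM.  The sketch below is illustrative only: the heap sizes, the slot count,
the OS reserve, and the daemon names are all assumed placeholders, not values
taken from this cluster.

```python
# Rough per-node memory budget check.
# Every number below is a hypothetical placeholder, not a measured
# or recommended setting.

def headroom_mb(node_ram_mb, daemon_heaps_mb, task_slots, task_heap_mb,
                os_reserve_mb=1024):
    """Return the RAM (in MB) left after daemons, task slots, and the OS."""
    used = (sum(daemon_heaps_mb.values())
            + task_slots * task_heap_mb
            + os_reserve_mb)
    return node_ram_mb - used

# Assumed heaps for the processes sharing one worker node.
worker_daemons = {
    "datanode": 1024,
    "mapreduce_daemon": 1024,
    "tabletserver": 2048,
}

# Two task slots at 1 GB each, on a 4 GB node and on an 8 GB node.
medium = headroom_mb(4 * 1024, worker_daemons, task_slots=2, task_heap_mb=1024)
large = headroom_mb(8 * 1024, worker_daemons, task_slots=2, task_heap_mb=1024)

print("4 GB node headroom (MB):", medium)
print("8 GB node headroom (MB):", large)
```

A negative result means the node is oversubscribed on paper; plugging in the
real heap settings for your own daemons shows where that point is.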

Thoughts:  You should have enough to get started.  If you don't know where
your limits are, you'll find them and then you can work to address them.
Recommendations:  If and when you're ready to optimize your project,
consider how your data is stored in Accumulo.  NoSQL is new enough that I
don't think the community has all the answers for particular use cases.



On Tue, Apr 16, 2013 at 8:07 PM, Terry P. wrote:

> Greetings everyone,
> I'm learning a lot from reading all of the great questions and informative
> answers here on the Accumulo mailing list.  Thus far I haven't come across
> a question similar to mine, nor a basic recommendation so here goes:
> I'm looking for recommendations on process / component placement for a
> small Accumulo cluster serving a prototype.  It will be scaled later, but
> for now I'm looking at a cluster with just 8 nodes.  My current thought
> process has led me to the following server / process placement and I'm
> interested in feedback on it.
> zoo1, zoo2, zoo3: ZooKeeper servers, dual proc, 4 GB RAM (small servers)
> namenode, secnamenode: 16GB RAM, 4 cores each, with local and remote
> locations to store name data
> *** Can I place the Accumulo Master on the NameNode or Secondary NameNode?
> ***
> accdata1, accdata2, accdata3: 16GB RAM, 4 cores each, serving as HDFS
> DataNodes and Accumulo TabletServers each with 4 2TB JBOD disks for HDFS
> I'm thinking having the Accumulo Master on the NameNode will simplify
> cluster startup.  Thoughts?  Recommendations?
> Many thanks in advance,
> Terry
