accumulo-user mailing list archives

From "Parker, Matthew - IS"
Subject RE: Accumulo Configuration Question
Date Thu, 31 Jan 2013 16:49:44 GMT
I don't control the system, and the admins won't open up ports to the outside. I'm stuck with
PuTTY access.

From: Jason Morris
Sent: Thursday, January 31, 2013 11:44 AM
Subject: Re: Accumulo Configuration Question

Have you tried setting up a SOCKS proxy via SSH and pulling the status page that way?
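
As a rough sketch of that suggestion: SSH dynamic forwarding (`-D`) opens a local SOCKS listener over the existing SSH session, so no extra ports need to be opened on the cluster side. The helper names, the gateway host, and the local port 1080 below are hypothetical, and 50095 is assumed to be the monitor's HTTP port (the default in Accumulo releases of this era; check monitor.port.client in your configuration). PuTTY's command-line client, plink, accepts the same `-D` and `-N` options.

```python
import shlex


def socks_tunnel_command(gateway, local_port=1080, client="ssh"):
    """Build the command line for an SSH dynamic (SOCKS) port forward.

    gateway    -- hypothetical "user@host" you can already reach over SSH
    local_port -- local port for the SOCKS listener (1080 is only a convention)
    client     -- "ssh" (OpenSSH) or "plink" (PuTTY's command-line client)
    """
    if client not in ("ssh", "plink"):
        raise ValueError("client must be 'ssh' or 'plink'")
    # -D opens a local SOCKS listener; -N skips starting a remote shell.
    return [client, "-N", "-D", str(local_port), gateway]


def monitor_url(monitor_host, port=50095):
    """URL of the monitor page as seen from inside the cluster.

    The port is an assumption; use whatever your site configuration sets.
    """
    return f"http://{monitor_host}:{port}/"


if __name__ == "__main__":
    cmd = socks_tunnel_command("user@gateway.example.com")
    print(" ".join(shlex.quote(part) for part in cmd))
    print(monitor_url("monitor-host.example.com"))
```

With the tunnel running, a desktop browser configured to use localhost:1080 as its SOCKS proxy can load the printed URL, graphs included.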

On Thu, Jan 31, 2013 at 11:19 AM, Parker, Matthew - IS wrote:
I'm sort of flying blind. The cluster is in a headless environment, and I can only access
the system via PuTTY at the command prompt. I've had to resort to using Lynx to browse the
monitor page. Unfortunately, the graphs don't translate well in a text-based browser.
Is there another way to get that information, for example from the logs?

From: William Slacum
Sent: Thursday, January 31, 2013 10:52 AM
Subject: Re: Accumulo Configuration Question

This doesn't have much to do with your cluster setup, but what does the monitor say as your
jobs are nearing completion and things start failing? Are there hold times for the table(s)
you are writing to?

On Thu, Jan 31, 2013 at 10:19 AM, Parker, Matthew - IS wrote:

I'm new to Accumulo and I've been trying to come up with a good architecture for a 20-node
cluster. I have been running a map/reduce program, and it encounters issues when it reaches
the Accumulo section of the code. Once the job's completion rate exceeds 93%, it
starts dropping tens of tasks because they eventually time out. The completion
rate drops back down, but the job eventually finishes. I suspect it's due to the
way I have the system configured, and I wanted to get some feedback on what the generally
preferred architecture is when installing Accumulo.

Since you can install any of the three (HDFS, map/reduce, and tablet servers) on a given
machine, the general guideline is to install two per machine (data node and tablet server, or
data node and map/reduce), as per the Hardware section in the Administration documentation.

Does that mean you have one large group of data nodes installed on the majority of
the cluster, or are they somehow split into two groups, such that map/reduce and HDFS run
on one set of nodes, and Accumulo tablet servers and HDFS use another?

Could people comment on what a working configuration might look like?





Jason Morris
TexelTek Inc.
308 Sentinel Drive
Suite 500
Annapolis Junction, MD  20701
Office: 301.880.7123 Ext. 6677
