cassandra-user mailing list archives

From Jake Luciani
Subject Re: 2 questions DataStax Enterprise
Date Tue, 03 Apr 2012 17:11:08 GMT
Hi, replies inline.

On Tue, Apr 3, 2012 at 12:18 PM, Alexandru Sicoe wrote:

> Hi guys,
>  I'm trying out DSE and looking for the best way to arrange the cluster. I
> have 9 nodes: 3 behind a gateway taking in writes from my collectors and 6
> outside the gateway that are supposed to take replicas from the other 3 and
> serve reads and analytics jobs.
> 1. Is it ok to run the 3 nodes as normal Cassandra nodes and run the other
> 6 nodes as analytics? Can I serve both real time reads and M/R jobs from
> the 6 nodes? How will these affect each other performance-wise?

If you plan to use CFS heavily, it will affect the performance of the other
nodes. If you raise the RF of your column families, it should be fine as
long as you run MapReduce at CL=ONE.
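
To illustrate the CL=ONE point: a read at ONE waits on a single replica no
matter how high the RF is, while QUORUM and ALL wait on more replicas as RF
grows. The Python sketch below only shows that arithmetic. The function name
is hypothetical and is not part of Cassandra or DSE, and the RF values are
example numbers rather than a recommendation for this cluster.

```python
# Sketch only: how many replicas a read must wait for at each consistency
# level. replicas_required is a hypothetical helper, not a Cassandra/DSE API.

def replicas_required(consistency_level: str, replication_factor: int) -> int:
    """Return the number of replicas that must answer a read."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    level = consistency_level.upper()
    if level == "ONE":
        return 1                              # independent of RF
    if level == "QUORUM":
        return replication_factor // 2 + 1    # strict majority of replicas
    if level == "ALL":
        return replication_factor             # every replica must answer
    raise ValueError(f"unsupported consistency level: {consistency_level}")


if __name__ == "__main__":
    # Example RF values, chosen only for illustration.
    for rf in (3, 6):
        counts = {cl: replicas_required(cl, rf) for cl in ("ONE", "QUORUM", "ALL")}
        print(f"RF={rf}: {counts}")
```

Raising the RF leaves the ONE row unchanged, which is why the MapReduce
reads stay cheap at that level.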

> I know that the way the system is supposed to be used is to separate
> analytics from real time queries. I've already explored a possible 3DC
> setup with Tyler in another message and it indeed works but I'm afraid it
> is too complex and would require me to send 2 replicas across the firewall
> which it can't handle very well at peak times, affecting other applications.
> 2. I started the cluster in the setup described in 1 (3 normal, 6
> analytics) and as soon as the Analytics nodes start up they start
> outputting this message:
> INFO [TASK-TRACKER-INIT] 2012-04-03 17:54:59,575 (line 629)
> Retrying connect to server: IP_OF_NORMAL_CASSANDRA_SEED_NODE:8012. Already
> tried 10 time(s).
> ....
> So it seems my analytics nodes are trying to contact the normal Cassandra
> seed node on port 8012 which I read is a "Hadoop Job Tracker client port".
> It doesn't seem like this is the normal behavior. Why is it getting
> confused? In the .yaml of each node I'm using endpoint_snitch:
> com.datastax.bdp.snitch.DseSimpleSnitch and putting in the Analytics seed
> node before the normal cassandra seed node in the seeds.

You can run dsetool movejt to move the jobtracker to one of the known
Hadoop nodes.

> Cheers,
> Alex

