incubator-cassandra-dev mailing list archives

From Bjorn Borud <>
Subject Re: Creating two instances in code
Date Tue, 17 Aug 2010 13:31:28 GMT
Gary Dusbabek <> writes:

> I looked into doing this when I was first learning the code and had an
> experience similar to yours.  At the time there wasn't much interest
> in seeing it through to fruition, but maybe times have changed.

any lack of interest in solving these problems just means that people
haven't stumbled on these problems yet :-)
...but eventually they will (and people like Ran Tavory and the Hector
team have already stumbled across these hurdles and had to devote time
to creating some workarounds).

> If I were to attempt it again I would do it in this order:
> 1.  Make the config customizable.

Would it be good enough if you had a CassandraConfig object and some
ways to create it?  Either directly or through:

  CassandraConfig config = CassandraConfig.parseFile(...);

and then some:

  Cassandra cassandra = Cassandra.createInstance(config);

or even

  Cassandra cassandra = new Cassandra(config);
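
to make that concrete, here is a rough sketch of the shape I have in mind.
everything in it is hypothetical: CassandraConfig and Cassandra do not exist
as such, the two fields are just examples, and parse(Reader) reads simple
key=value pairs as a stand-in for parseFile(...) and the real config format.
the only point is that all state hangs off the instance, not off statics.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical: an immutable, per-instance configuration object.
final class CassandraConfig {
    final String dataDir;
    final int storagePort;

    CassandraConfig(String dataDir, int storagePort) {
        this.dataDir = dataDir;
        this.storagePort = storagePort;
    }

    // Stand-in for parseFile(...): key=value pairs, not the real config format.
    static CassandraConfig parse(Reader in) {
        Properties p = new Properties();
        try {
            p.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CassandraConfig(
                p.getProperty("data_dir", "/tmp/cassandra"),
                Integer.parseInt(p.getProperty("storage_port", "7000")));
    }
}

// Hypothetical: the config is held by the instance, never by a static field.
final class Cassandra {
    private final CassandraConfig config;

    Cassandra(CassandraConfig config) {
        this.config = config;
    }

    static Cassandra createInstance(CassandraConfig config) {
        return new Cassandra(config);
    }

    CassandraConfig config() {
        return config;
    }
}

class TwoInstances {
    public static void main(String[] args) {
        // Two instances in one JVM, each with its own directories and port.
        Cassandra a = Cassandra.createInstance(CassandraConfig.parse(
                new StringReader("data_dir=/tmp/a\nstorage_port=7000\n")));
        Cassandra b = new Cassandra(new CassandraConfig("/tmp/b", 7001));
        System.out.println(a.config().dataDir + ":" + a.config().storagePort);
        System.out.println(b.config().dataDir + ":" + b.config().storagePort);
    }
}
```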

> 2.  Make the services re-entrant (You should be able to start, stop,
> then start again without problems).

you mean restart an instance or be able to throw away your instance and
create a new one?  for me, being able to restart a stopped instance
isn't really that important because it would work fine for me to create
a new instance (possibly with the same config, using the same files/dirs
and ports).  

you may have good reasons to be able to restart a stopped Cassandra
instance though.  (But I suspect we more or less want the same thing).
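
the "throw away and create a new one" variant could look roughly like this.
Instance is a made-up class, and start()/stop() only flip a state flag where
real code would bind ports and open files.  a stopped instance refuses to
start again, so "restart" just means constructing a new object with the same
settings.

```java
// Hypothetical single-use instance: start once, stop once, never restart.
final class Instance {
    private enum State { NEW, RUNNING, STOPPED }

    private final String dataDir;
    private State state = State.NEW;

    Instance(String dataDir) {
        this.dataDir = dataDir;
    }

    synchronized void start() {
        if (state != State.NEW) {
            throw new IllegalStateException("single-use: create a new Instance instead");
        }
        state = State.RUNNING; // real code would open files and bind ports here
    }

    synchronized void stop() {
        if (state == State.RUNNING) {
            state = State.STOPPED; // real code would release files and ports here
        }
    }

    synchronized boolean isRunning() {
        return state == State.RUNNING;
    }

    String dataDir() {
        return dataDir;
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        Instance first = new Instance("/tmp/cassandra-test");
        first.start();
        first.stop();

        // "Restart" by building a fresh object over the same directory.
        Instance second = new Instance(first.dataDir());
        second.start();
        System.out.println("second running: " + second.isRunning());
        second.stop();
    }
}
```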

> 3.  Get rid of the singletons.  This will involve coming up with a
> smart way to couple instances of the services with each other.

indeed.  but I hope nobody falls for the temptation of introducing
Spring or something similar to do the wiring in the Cassandra
code. (what people do in their own projects is their problem, but
Cassandra should not require you to adopt additional mammoth frameworks).
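
plain constructor wiring is probably enough.  a minimal sketch, assuming
three made-up services (Messaging, Gossip and Storage are simplified
stand-ins, not the real classes) and a made-up Node class that owns one
object graph per instance:

```java
// Hypothetical stand-ins for the real services; only the wiring matters here.
final class Messaging {
    final int port;

    Messaging(int port) {
        this.port = port;
    }
}

final class Gossip {
    final Messaging messaging;

    Gossip(Messaging messaging) {
        this.messaging = messaging;
    }
}

final class Storage {
    final Messaging messaging;
    final Gossip gossip;

    Storage(Messaging messaging, Gossip gossip) {
        this.messaging = messaging;
        this.gossip = gossip;
    }
}

// One Node owns one object graph; two Nodes share nothing.
final class Node {
    final Messaging messaging;
    final Gossip gossip;
    final Storage storage;

    Node(int storagePort) {
        // Dependencies are handed over in constructors, not looked up via statics.
        this.messaging = new Messaging(storagePort);
        this.gossip = new Gossip(messaging);
        this.storage = new Storage(messaging, gossip);
    }
}

class WiringDemo {
    public static void main(String[] args) {
        Node a = new Node(7000);
        Node b = new Node(7001);
        System.out.println("a shares its own messaging: " + (a.storage.messaging == a.messaging));
        System.out.println("a and b are isolated: " + (a.messaging != b.messaging));
    }
}
```

the wiring lives in one constructor, so there is nothing to configure and no
framework to pull in.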

> 4.  Integrate the storage port into how we canonically identify a node
> (it's just hostname now).

hmm, I see your point, but I am not sure I understand the consequences

> 5.  While you're at it, figure out how to get JMX to bind to something
> other than  (I hear it is possible, see

I have limited experience with JMX so I'll pass on commenting on this.

>> there are other valid reasons for wanting to embed Cassandra besides
>> unit testing.  for instance, if you are writing an application that
>> depends on Cassandra and you want the option of packaging it as a single
>> binary for single node experimentation, development and demo purposes.
> I'd kind of like to see this too, although I admit that from the
> pragmatic standpoint of running a Cassandra server, it represents a
> whole lot of change for what amounts to very little tangible benefit.

while the benefit may be hard to articulate, I think it is significant.
any time you can embed a "server" in your binary you can make life a lot
easier for casual users and for testing.

almost all server projects I have done in the past 7-8 years have been
like this:  I make it possible to embed the server so that people can
build and distribute prototypes or they can use the exact same binary to
either use an external (distributed) instance or just create an internal
instance for simpler use-cases (by config).
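
roughly the pattern I mean, as a sketch.  all names here (Store,
EmbeddedStore, ExternalStore, and the "store.mode" and "store.hosts" keys)
are invented for illustration, and the external variant is only a
placeholder that does no networking:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction the application codes against.
interface Store {
    void put(String key, String value);

    String get(String key);
}

// In-process variant: stands in for an embedded server instance.
final class EmbeddedStore implements Store {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    public void put(String key, String value) {
        data.put(key, value);
    }

    public String get(String key) {
        return data.get(key);
    }
}

// Placeholder for a client talking to an external (distributed) cluster.
final class ExternalStore implements Store {
    final String hosts;

    ExternalStore(String hosts) {
        this.hosts = hosts;
    }

    public void put(String key, String value) {
        throw new UnsupportedOperationException("sketch only: no networking");
    }

    public String get(String key) {
        throw new UnsupportedOperationException("sketch only: no networking");
    }
}

final class Stores {
    // Same binary, different backend: the choice is made purely by config.
    static Store fromConfig(Map<String, String> config) {
        String mode = config.getOrDefault("store.mode", "embedded");
        if ("external".equals(mode)) {
            return new ExternalStore(config.get("store.hosts"));
        }
        return new EmbeddedStore();
    }
}

class EmbedDemo {
    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("store.mode", "embedded");
        Store store = Stores.fromConfig(config);
        store.put("greeting", "hello");
        System.out.println(store.get("greeting"));
    }
}
```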

compare to Hudson.  it is distributed as a WAR so you can load it into
your web server.  but for most people, they just want it up and running
with as little hassle as possible on a single node, so being able to
fire it up from the command line, and rely on the embedded web server is
very attractive compared to fooling around with Jetty, Tomcat or worse.
if Hudson had required me to manage a number of services that I need to
manually set up and manage, I would probably not have bothered using it.

(not sure if that example is very clear, but hey... :-)

> From a development standpoint, the biggest benefit I see is that
> we could write unit tests for small clusters that run on a single
> node.

yeah, it is critical for unit testing.  right now we are forced to do
testing in a rather clumsy fashion.  it is a big step backward from, for
instance, the way I do testing with Apache Derby (which has hairy
lifecycle management, but it is embeddable).

> One interesting thing that this would make possible is the ability to
> have a node with >1 tokens in a single JVM.  Useful, who knows?  But
> it is interesting because I think it would make Cassandra more elastic
> (and could theoretically help with the hot-node problem when using
> OPP).

(there are some usage scenarios using OSGi to run multiple Cassandra
instances in the same JVM that come to mind, but I haven't really given
this a lot of (any) detailed thought)

