cassandra-user mailing list archives

From Janne Jalkanen <janne.jalka...@ecyrd.com>
Subject Re: Setting up Cassandra to store on a specific node and not replicate
Date Wed, 18 Dec 2013 10:27:17 GMT

This may be hard because the coordinator could store hinted handoff (HH) data on disk. You
could turn HH off and use RF=1 to keep data on a single instance, but you would be likely
to lose data if you had any problems with your instances… Also, you would need to tweak how
often writes are synced to disk (the commit log sync period, which defaults to ten seconds)
so that it happens more often. Or lose data. You will also have an "interesting" time scaling
your cluster and would have to plan for that in your custom database.
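
As a rough sketch of the settings involved, assuming a stock cassandra.yaml: the ten-second default mentioned above appears to correspond to the periodic commit log sync. The value 1000 below is only an illustrative choice, not a recommendation, and RF=1 is set separately, per keyspace, in its replication options.

```yaml
# cassandra.yaml fragment (a sketch; check it against your version's defaults)

# Coordinators no longer store hints on disk on behalf of other nodes.
hinted_handoff_enabled: false

# Sync the commit log to disk more often than the default.
# The default period is 10000 ms (ten seconds); 1000 is an assumed example value.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 1000
```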

Essentially you want to turn off all the features which make Cassandra a robust product ;-).
Without knowing your requirements more precisely, I'd be inclined to recommend manually sharding
on MariaDB or Postgres instances instead, or using their underlying storage engines directly
(e.g. InnoDB), if you're just looking for a key-value store.
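
To make "manually sharding" concrete, here is a minimal sketch of one way it could look: the application keeps an explicit key-to-instance directory, so each key lives on exactly one chosen instance and nothing is replicated. The class name `ShardRouter`, the shard names, and the connection strings are all hypothetical, and the directory itself would have to be persisted somewhere in a real system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ShardRouter:
    """Maps each key to the single database instance explicitly chosen for it."""

    # shard name -> connection string (hypothetical values supplied by the caller)
    shards: Dict[str, str]
    # key -> shard name; in a real system this directory must itself be stored durably
    placement: Dict[str, str] = field(default_factory=dict)

    def assign(self, key: str, shard: str) -> None:
        """Pin a key to one shard; refuse to move it to a different one."""
        if shard not in self.shards:
            raise KeyError(f"unknown shard: {shard}")
        existing = self.placement.get(key)
        if existing is not None and existing != shard:
            raise ValueError(f"{key} is already pinned to {existing}")
        self.placement[key] = shard

    def locate(self, key: str) -> str:
        """Return the connection string of the only instance holding this key."""
        return self.shards[self.placement[key]]


if __name__ == "__main__":
    router = ShardRouter(
        shards={
            "node-a": "postgresql://node-a.example/graph",
            "node-b": "postgresql://node-b.example/graph",
        }
    )
    router.assign("vertex:42", "node-a")
    print(router.locate("vertex:42"))
```

The point of the explicit directory, rather than hashing the key, is that the caller decides where data lands, which is what the original requirement asks for.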

/Janne

On 18 Dec 2013, at 11:20, Colin MacDonald <colin.macdonald@sas.com> wrote:

> Ahoy the list.  I am evaluating Cassandra in the context of using it as a storage back
> end for the Titan graph database.
>  
> We’ll have several nodes in the cluster.  However, one of our requirements is that
> data has to be loaded into and stored on a specific node and only on that node.  Also, it
> cannot be replicated around the system, at least not stored persistently on disk – we will
> of course make copies in memory and on the wire as we access remote nodes.  These requirements
> are non-negotiable.
>  
> We understand that this is essentially the opposite of what Cassandra is designed for,
> and that we’re missing all the scalability and robustness, but is it technically possible?
>  
> First, I would need to create a custom partitioner – is there any tutorial on that?
> I see a few “you don’t need to” threads, but I do.
>  
> Second, how easy is it to have Cassandra not replicate data between nodes in a cluster?
> I’m not seeing an obvious configuration option for that, presumably because it obviates
> much of the point of using Cassandra, but again, we’re working within some rather unfortunate
> constraints.
>  
> Any hints or suggestions would be most gratefully received.
>  
> Kind regards,
>  
> -Colin MacDonald-
>  

