incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Single Node Cassandra Installation
Date Mon, 19 Mar 2012 09:34:01 GMT
> Even more: if you enable read repair, the chances of having bad writes decrease for any
further reads. This will make your cluster become consistent again faster after a failure.
Under 1.0 the default read repair (RR) probability was reduced to 10%, because Hinted Handoff
was changed to also store hints for nodes that fail to respond to a write. Previously it only
stored hints for nodes that were down when the request started.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/03/2012, at 1:48 AM, R. Verlangen wrote:

> " By default Cassandra tries to write to both nodes, always. Writes will only fail (on
a node) if it is down, and even then hinted handoff will attempt to keep both nodes in sync
when the troubled node comes back up. The point of having two nodes is to have read and write
availability in the face of transient failure. "
> 
> Even more: if you enable read repair, the chances of having bad writes decrease for any
further reads. This will make your cluster become consistent again faster after a failure.
> 
> Also consider using different CLs for different operations. E.g. the Twitter timeline
can miss some records, but if you were displaying my bank account I would prefer
to see the right thing, or a nice error message.
> 
> 2012/3/16 Ben Coverston <ben.coverston@datastax.com>
> Doing reads and writes at CL=1 with RF=2 N=2 does not imply that the reads will be inconsistent.
It's more complicated than the simple counting of blocked replicas. It is easy to support
the notion that it will be largely consistent, in fact very consistent for most use cases.
> 
> By default Cassandra tries to write to both nodes, always. Writes will only fail (on
a node) if it is down, and even then hinted handoff will attempt to keep both nodes in sync
when the troubled node comes back up. The point of having two nodes is to have read and write
availability in the face of transient failure.
> 
> If you are interested there is a good exposition of what 'consistency' means in a system
like Cassandra from the link below[1].
> 
> [1]
> http://www.eecs.berkeley.edu/~pbailis/projects/pbs/
> 
> 
> On Fri, Mar 16, 2012 at 6:50 AM, Thomas van Neerijnen <tom@bossastudios.com> wrote:
> You'll need to either read or write at quorum or higher to get consistent data from the
cluster, so you may as well do both.
> Now that you mention it, I was wrong about downtime: with a two node cluster, reads or
writes at quorum mean both nodes need to be online. Perhaps you could have an emergency
switch in your application which flips to a consistency level of 1 if one of your Cassandra
servers goes down? Just make sure it's set back to quorum when the second one returns, or
again you could end up with inconsistent data.
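
The quorum arithmetic behind this advice can be sketched as follows. This is a
minimal illustration in plain Python, not Cassandra or driver code: the helper
names quorum, reads_see_latest_write and pick_consistency are hypothetical, and
the sketch assumes the usual rule that a read is guaranteed to overlap the latest
successful write only when R + W > RF.

```python
def quorum(rf: int) -> int:
    """Replicas a QUORUM operation waits for: floor(rf / 2) + 1."""
    return rf // 2 + 1


def reads_see_latest_write(r: int, w: int, rf: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return r + w > rf


def pick_consistency(live_nodes: int, rf: int) -> str:
    """Hypothetical 'emergency switch': fall back to ONE when quorum is unreachable."""
    return "QUORUM" if live_nodes >= quorum(rf) else "ONE"


rf = 2
q = quorum(rf)  # with RF=2 a quorum is 2, i.e. both nodes must be up

print(q)
print(reads_see_latest_write(1, 1, rf))  # ONE/ONE: overlap is not guaranteed
print(reads_see_latest_write(q, 1, rf))  # quorum reads, ONE writes
print(reads_see_latest_write(1, q, rf))  # ONE reads, quorum writes
print(pick_consistency(live_nodes=1, rf=rf))  # one node down
```

With RF=2 a quorum is both replicas, which is why quorum operations stop working
as soon as one node is down. The missing overlap guarantee at CL=1 does not by
itself mean reads will be inconsistent in practice; hinted handoff and read
repair, discussed elsewhere in this thread, narrow that window.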
> 
> 
> On Fri, Mar 16, 2012 at 2:04 AM, Drew Kutcharian <drew@venarc.com> wrote:
> Thanks for the comments, I guess I will end up doing a 2 node cluster with replica count
2 and read consistency 1.
> 
> -- Drew
> 
> 
> 
> On Mar 15, 2012, at 4:20 PM, Thomas van Neerijnen wrote:
> 
>> So long as data loss and downtime are acceptable risks a one node cluster is fine.
>> Personally this is usually only acceptable on my workstation, even my dev environment
is redundant, because servers fail, usually when you least want them to, like for example
when you've decided to save costs by waiting before implementing redundancy. Could a failure
end up costing you more than you've saved? I'd rather get cheaper servers (maybe even used
off ebay??) so I could have at least two of them.
>> 
>> If you do go with a one node solution, Priam looks like a good place to start for
backups, although I haven't tried it myself. Otherwise roll your own with incremental
snapshotting turned on and a watch on the snapshot directory. Storage on something like S3
or Cloud Files is very cheap, so there's no good excuse for no backups.
>> 
>> On Thu, Mar 15, 2012 at 7:12 PM, R. Verlangen <robin@us2.nl> wrote:
>> Hi Drew,
>> 
>> One other disadvantage is the lack of "consistency level" and "replication". Both
are part of the high availability / redundancy. So you would really need to back up your
single-node-"cluster" to some other external location.
>> 
>> Good luck!
>> 
>> 
>> 2012/3/15 Drew Kutcharian <drew@venarc.com>
>> Hi,
>> 
>> We are working on a project that initially is going to have very little data, but
we would like to use Cassandra to ease future scalability. Due to budget constraints,
we were thinking of running a single node Cassandra for now and then adding more nodes as
required.
>> 
>> I was wondering if it is recommended to run a single node Cassandra in production?
Are there any other issues besides lack of high availability?
>> 
>> Thanks,
>> 
>> Drew
>> 
>> 
>> 
> 
> 
> 
> 
> 
> -- 
> Ben Coverston
> DataStax -- The Apache Cassandra Company
> 
> 

