cassandra-user mailing list archives

From Thibaut Britz <>
Subject Re: Question about consistency level & data propagation & eventually consistent
Date Thu, 11 Nov 2010 10:57:59 GMT

Thanks for all the informative answers.

Since writing is much faster than reading, I assume that it's faster to
write the data to 3 replicas and read from 1 instead of writing to 2 and
reading from at least 2 (especially if I execute the read operation
multiple times on the same key). I could then easily double my read

I would then like to do the following: always write to all nodes which are
marked as up, then read from one replica. If a node goes down (hardware
failure or Cassandra down), I would run the repair tool and fix the node;
this shouldn't happen very often. I can also deal with very small
inconsistencies.
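
To illustrate the trade-off described above, here is a minimal sketch of the
usual overlap rule: a read is guaranteed to touch at least one replica that
holds the latest acknowledged write only when W + R > N. The function name and
its parameters are hypothetical and are not part of any Cassandra API; this is
arithmetic only.

```python
def read_overlaps_write(replicas: int, write_acks: int, read_replicas: int) -> bool:
    """Return True when every read set must intersect every acknowledged write set.

    replicas      -- N, number of replicas holding the key (assumed 3 here)
    write_acks    -- W, replicas that acknowledged the write
    read_replicas -- R, replicas consulted by the read
    """
    return write_acks + read_replicas > replicas


N = 3  # assumption: three replicas per key, as in this thread
for w, r in [(3, 1), (2, 2), (2, 1), (1, 1)]:
    print(f"N={N} W={w} R={r} overlap={read_overlaps_write(N, w, r)}")
```

Writing to 3 and reading from 1 overlaps, as does writing to 2 and reading
from 2. If only 2 of the 3 replicas receive a write (one node down) and the
read still goes to 1 replica, the overlap is no longer guaranteed.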

- What consistency level would I have to choose? ALL will fail if one node
is down; QUORUM will only write to the quorum. I would need something that
will write to all nodes which are marked as up.
- If I choose QUORUM, what will happen to the remaining writes if the node
is marked as up? Will they always be executed, or can they be dropped (e.g.
if the node is doing a compaction while the write happens)?
- To bring a node back into the system, I would run the repair command on
the node. Is there a way to do an offline repair (so I can make sure that my
application won't read from this node)? I guess changing the port
temporarily will not work, since Cassandra will advertise the node to my
client through the other nodes?
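
The difference between the levels in the first question can be sketched with
plain arithmetic. This assumes a replication factor of 3 and the usual
definition of a quorum as floor(RF / 2) + 1; the helper names are hypothetical
and are not Cassandra API calls. The sketch only models how many
acknowledgements each level waits for, not which replicas the write is sent to.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replica acknowledgements a write at this level waits for."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown level: {level}")


def write_can_succeed(level: str, replication_factor: int, live_replicas: int) -> bool:
    """A write can be acknowledged only if enough replicas are up to ack it."""
    return live_replicas >= required_acks(level, replication_factor)


RF = 3  # assumption: replication factor used in this thread
for live in (3, 2, 1):
    for level in ("ONE", "QUORUM", "ALL"):
        ok = write_can_succeed(level, RF, live)
        print(f"live={live} level={level} acks={required_acks(level, RF)} ok={ok}")
```

With one of three replicas down, ALL can no longer be satisfied while QUORUM
and ONE still can, which matches the behaviour described in the question.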


On Wed, Nov 10, 2010 at 5:52 PM, Jonathan Ellis <> wrote:

> On Wed, Nov 10, 2010 at 8:54 AM, Thibaut Britz
> <> wrote:
> > Assuming I'm reading and writing with consistency level 1 (one), read
> > repair turned off, I have a few questions about data propagation.
> > Data is being stored at consistency level 3.
> > 1) If all nodes are up:
> >  - Will all writes eventually reach all nodes (of the 3 nodes)?
> Yes.
> >  - What will be the maximal time until the last write reaches the
> >    last node?
> Situation-dependent.  The important thing is that if you are writing
> at CL.ALL, it will be before the write is acked to the client.
> > 2) If one or two nodes are down
> > - As I understood it, one node will buffer the writes for the remaining
> > nodes.
> Yes: _after_ the failure detector recognizes them as down. This will
> take several seconds.
> > - If the nodes go up again: When will these writes be propagated?
> When FD recognizes them as back up.
> > The best way would then be to run nodetool repair after the two nodes
> > are available again. Is there a way to make the node not accept any
> > connections during that time until it is finished repairing? (e.g. throw
> > the UnavailableException)
> No.  The way to prevent stale reads is to use an appropriate
> consistencylevel, not error-prone heuristics.  (For instance: what if
> the replica with the most recent data were itself down when the first
> node recovered and initiated repair?)
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
