cassandra-user mailing list archives

From "B. Todd Burruss" <bburr...@real.com>
Subject Re: something bizarre occurred
Date Fri, 15 Jan 2010 23:43:54 GMT
so i changed to QUORUM and retested.  "puts" again work as expected when
a node is down.  thx!

however, the response time for puts went from about 5ms to 400ms because
i took 1 of the 5 nodes out.  ROW-MUTATION-STAGE pendings jumped into the
100's on one of the remaining nodes and the WriteLatency for the column
family on this node also went thru the roof.

i added the server back and the performance immediately went back to the
way it was.

is cassandra trying to constantly connect to the downed server?  or what
might be causing the performance to drop so dramatically?

On Fri, 2010-01-15 at 13:20 -0800, Jonathan Ellis wrote:
> right
> 
> On Fri, Jan 15, 2010 at 3:13 PM, B. Todd Burruss <bburruss@real.com> wrote:
> > so with 5 node cluster, R=W=Q and RF=3, i can only lose one consecutive
> > node on the consistency "ring", correct?
> >
> >
> > On Fri, 2010-01-15 at 12:54 -0800, Jonathan Ellis wrote:
> >> it has to do w/ consistency guarantees:
> >> http://wiki.apache.org/cassandra/HintedHandoff
> >>
> >> use quorum reads and writes instead of ALL on writes if you need both
> >> consistency and availability
> >>
> >> -Jonathan
> >>
> >> On Fri, Jan 15, 2010 at 2:50 PM, B. Todd Burruss <bburruss@real.com> wrote:
> >> > that makes sense, but i have had trouble understanding why
> >> > hinted-handoff doesn't take care of it?  if not, how many nodes would i
> >> > need to prevent this?
> >> >
> >> > thx
> >> >
> >> >
> >> > On Fri, 2010-01-15 at 12:43 -0800, Jonathan Ellis wrote:
> >> >> On Fri, Jan 15, 2010 at 2:39 PM, B. Todd Burruss <bburruss@real.com> wrote:
> >> >> > i'm trying to understand why cassandra 0.5 RC3 is behaving like it is.  I
> >> >> > have a 5 node cluster, RF=3, W=ALL, R=1.  all is well if all the nodes are
> >> >> > running.  if i remove a node, then "puts" fail - doesn't matter which host
> >> >> > i'm connected to.  if i restart the node, then all goes back to normal
> >> >> > operation.
> >> >> >
> >> >> > the obvious misunderstanding to me is that i have set W=ALL.  As I
> >> >> > understand it, this should mean that the data will be written to ALL the
> >> >> > replicas (RF=3) not all the nodes in the cluster.
> >> >>
> >> >> Right, but if you take one of the nodes down then it is going to be
> >> >> one of the three replicas for 3/5 of your keys.  (Could be more
> >> >> depending on your partitioner and whether you balanced your nodes.)
> >> >>
> >> >> > i also see the following message upon restarting the node that i stopped -
> >> >> > is it a problem?
> >> >>
> >> >> No.
> >> >>
> >> >> -Jonathan
> >> >
> >> >
> >> >
> >
> >
> >
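
The availability arithmetic discussed in this thread (one down node being a replica for 3/5 of the keys, and W=ALL versus QUORUM) can be sketched in a few lines of Python. This is only an illustrative model, not Cassandra code. It assumes an evenly balanced ring, a simple placement where each key range is stored on its primary node and the next RF-1 nodes around the ring, and that a write succeeds only if the required number of replicas is alive (hinted writes not counting toward the consistency level). All function names here are hypothetical.

```python
from fractions import Fraction


def replica_sets(n_nodes, rf):
    """One replica set per primary range: the primary node plus the next
    rf-1 nodes on the ring (assumed placement, balanced tokens)."""
    return [tuple((start + i) % n_nodes for i in range(rf))
            for start in range(n_nodes)]


def quorum(rf):
    """Majority of the replicas."""
    return rf // 2 + 1


def fraction_replicated_on(n_nodes, rf, node):
    """Fraction of key ranges for which `node` is one of the replicas."""
    sets = replica_sets(n_nodes, rf)
    return Fraction(sum(1 for s in sets if node in s), len(sets))


def writable_fraction(n_nodes, rf, down, required):
    """Fraction of key ranges that still have `required` live replicas
    when the nodes in `down` are unavailable."""
    sets = replica_sets(n_nodes, rf)
    ok = sum(1 for s in sets
             if sum(1 for n in s if n not in down) >= required)
    return Fraction(ok, len(sets))


N, RF = 5, 3
print(fraction_replicated_on(N, RF, 0))              # 3/5
print(writable_fraction(N, RF, {0}, RF))             # W=ALL, one node down: 2/5
print(writable_fraction(N, RF, {0}, quorum(RF)))     # W=QUORUM, one node down: 1
print(writable_fraction(N, RF, {0, 1}, quorum(RF)))  # two adjacent nodes down: 3/5
print(writable_fraction(N, RF, {0, 2}, quorum(RF)))  # two non-adjacent nodes down: 4/5
```

Under these assumptions, a 5 node ring with RF=3 keeps every key writable at QUORUM with one node down. With two nodes down, some ranges lose quorum whether or not the two nodes are adjacent, because any two nodes in a 5 node ring share at least one 3-node replica set.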


