From: Edmond Lau <edmond@ooyala.com>
Date: Thu, 29 Oct 2009 12:18:26 -0700
To: cassandra-user@incubator.apache.org
Subject: can't write with consistency level of one after some nodes fail

I have a freshly started 3-node cluster with a replication factor of 2. If I take down two nodes, I can no longer do any writes, even with a consistency level of one. I tried a variety of keys to ensure that I'd get at least one where the live node was responsible for one of the replicas. I have not yet tried on trunk. On Cassandra 0.4.1, I get an UnavailableException:

DEBUG [pool-1-thread-1] 2009-10-29 18:53:24,371 CassandraServer.java (line 408) insert
 WARN [pool-1-thread-1] 2009-10-29 18:53:24,388 AbstractReplicationStrategy.java (line 135) Unable to find a live Endpoint we might be out of live nodes , This is dangerous !!!!
ERROR [pool-1-thread-1] 2009-10-29 18:53:24,390 StorageProxy.java (line 179) error writing key 1
UnavailableException()
        at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:156)
        at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:468)
        at org.apache.cassandra.service.CassandraServer.insert(CassandraServer.java:421)
        at org.apache.cassandra.service.Cassandra$Processor$insert.process(Cassandra.java:824)
        at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:627)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
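For concreteness, here's roughly the write I'm issuing -- a minimal sketch against the 0.4 Thrift-generated Java classes. The keyspace, column family, key, and column name come from the server log above; I'm assuming the org.apache.cassandra.service package shown in the stack trace, the default Thrift port 9160 on localhost, and an insert(table, key, column_path, value, timestamp, consistency_level) signature, so the exact generated signature may differ slightly:

import org.apache.cassandra.service.Cassandra;
import org.apache.cassandra.service.ColumnPath;
import org.apache.cassandra.service.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class WriteAtOne {
    public static void main(String[] args) throws Exception {
        // Connect to the one node that is still up.
        TSocket socket = new TSocket("localhost", 9160);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));
        socket.open();

        // The same write that appears in the server log:
        // key "1", column family Standard1, column name "value".
        ColumnPath path = new ColumnPath("Standard1", null, "value".getBytes("UTF-8"));
        client.insert("Keyspace1", "1", path, "v".getBytes("UTF-8"),
                      System.currentTimeMillis(), ConsistencyLevel.ONE);

        // With two of the three nodes down, this throws UnavailableException
        // even when the live node is a replica for the key.
        socket.close();
    }
}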
Moreover, if I bring another of the nodes back up, do some writes, take it back down again, and try to do writes with a consistency level of one, I get an ApplicationException with the error text "unknown result". There's nothing in the debug logs about this new error:

DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,411 CassandraServer.java (line 258) get
DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,411 CassandraServer.java (line 307) multiget
DEBUG [pool-1-thread-7] 2009-10-29 19:08:26,413 StorageProxy.java (line 239) weakreadlocal reading SliceByNamesReadCommand(table='Keyspace1', key='3', columnParent='QueryPath(columnFamilyName='Standard1', superColumnName='null', columnName='null')', columns=[[118, 97, 108, 117, 101],])

I would've instead expected the node to accept the write and then have the key repaired on subsequent reads once the other nodes come back up; a sketch of the behavior I expected is below.

Along the same lines, how does Cassandra handle a network partition where two writes for the same key hit two different partitions, neither of which is able to form a quorum? Dynamo maintained version vectors and put the burden on the client to resolve conflicts, but there's no similar interface in the Thrift API.
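To spell out what I expected, here it is as hypothetical coordinator pseudocode in Java -- this is my mental model, not Cassandra's actual implementation, and every helper below is made up:

import java.net.InetAddress;
import java.util.List;

// Hypothetical write path for ConsistencyLevel.ONE: succeed as long as
// any replica is alive, remember hints for the dead ones, and let later
// reads repair the key. None of these methods exist in Cassandra.
abstract class ExpectedWritePath {
    abstract List<InetAddress> replicasFor(String key);  // ring lookup; RF=2 -> 2 nodes
    abstract boolean isLive(InetAddress node);           // failure detector
    abstract void send(InetAddress node, String key, byte[] value);
    abstract void storeHint(InetAddress node, String key, byte[] value);

    void insertAtOne(String key, byte[] value) throws Exception {
        int sent = 0;
        for (InetAddress node : replicasFor(key)) {
            if (isLive(node)) {
                send(node, key, value);                  // at least one ack is possible
                sent++;
            } else {
                storeHint(node, key, value);             // replay when the node returns
            }
        }
        if (sent == 0) {
            throw new Exception("unavailable");          // fail only with zero live replicas
        }
    }
}

The point being that a write at ONE should only surface UnavailableException when zero replicas are reachable, which isn't the case for the keys I tried.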
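As for the partition question, the kind of client-visible versioning I have in mind is along these lines -- an illustration of Dynamo-style version vectors, not anything the Thrift API exposes:

import java.util.HashMap;
import java.util.Map;

// One counter per replica that has coordinated a write to a key.
// Two versions of the key conflict when neither vector dominates the
// other, which is exactly the two-sides-of-a-partition case above.
class VersionVector {
    private final Map<String, Long> counters = new HashMap<String, Long>();

    void incrementFor(String replica) {
        Long c = counters.get(replica);
        counters.put(replica, c == null ? 1L : c + 1);
    }

    // True if every counter here is >= the corresponding counter in other.
    boolean dominates(VersionVector other) {
        for (Map.Entry<String, Long> e : other.counters.entrySet()) {
            Long mine = counters.get(e.getKey());
            if (mine == null || mine < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    // Concurrent updates: neither side dominates, so the client must merge.
    boolean conflictsWith(VersionVector other) {
        return !dominates(other) && !other.dominates(this);
    }
}

Dynamo returns both conflicting versions to the reader and leaves the merge to the application; I don't see a way to express that through the current Thrift interface.

Edmond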