cassandra-user mailing list archives

From James Cipar <jci...@cmu.edu>
Subject Re: Consistency model
Date Sun, 17 Apr 2011 01:35:20 GMT
Here it is.  I left some setup code and global variable definitions out of the
previous code, but they are pretty similar to the setup code here.

    import pycassa
    import random
    import time

    consistency_level = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
    duration = 600
    sleeptime = 0.0
    hostlist = 'worker-hostlist'

    def read_servers(fn):
        # One hostname per line; skip blank lines so we never build ':9160'.
        with open(fn) as f:
            return [line.strip() for line in f if line.strip()]

    servers = read_servers(hostlist)
    start_time = time.time()
    seqnum = -1
    timestamp = 0

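    while time.time() < start_time + duration:
        # Pick a random node to connect to for this read.
        target_server = '%s:9160' % random.choice(servers)

        pool = None
        try:
            pool = pycassa.connect('Keyspace1', [target_server])
            cf = pycassa.ColumnFamily(pool, 'Standard1')
            row = cf.get('foo', read_consistency_level=consistency_level)
        except Exception:
            # Connection or read failed; back off and try another node.
            time.sleep(sleeptime)
            continue
        finally:
            # Release the connection even when the read raised.
            if pool is not None:
                pool.dispose()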

        sq = int(row['seqnum'])
        ts = float(row['timestamp'])

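        # A sequence number lower than the last one seen means this read
        # returned an older version of the row.
        if sq < seqnum:
            print 'Row changed: %i %f -> %i %f'%(seqnum, timestamp, sq, ts)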
        seqnum = sq
        timestamp = ts

        if sleeptime > 0.0:
            time.sleep(sleeptime)
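
The check at the end of the loop can be tried without a cluster. Below is a minimal sketch, assuming the observations have already been collected as (seqnum, timestamp) pairs. `find_regressions` is a hypothetical helper, not part of pycassa, and the sample data is made up for illustration rather than taken from the experiment.

```python
def find_regressions(observations):
    """Return (previous, current) pairs where the seqnum went backwards.

    `observations` is an iterable of (seqnum, timestamp) tuples in the
    order the reader saw them.
    """
    regressions = []
    last = None
    for obs in observations:
        if last is not None and obs[0] < last[0]:
            regressions.append((last, obs))
        # Like the reader loop, always remember the most recent read,
        # even when it was a stale one.
        last = obs
    return regressions


# Made-up sample data, not measurements from the experiment.
sample = [(1, 10.0), (2, 10.5), (3, 11.0), (2, 10.5), (4, 12.0)]
for prev, cur in find_regressions(sample):
    print('Row changed: %i %f -> %i %f' % (prev[0], prev[1], cur[0], cur[1]))
```

Here the only pair reported is the step from seqnum 3 back to seqnum 2.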



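To make the hypothesis from my original message (quoted below) concrete, here is a toy model of it. This is only a sketch of the scenario I described, not a statement about how Cassandra actually behaves: it assumes a write that has so far reached only some replicas, and a read that returns the newest version among an arbitrary quorum. `quorum_size` and `read_outcomes` are hypothetical names, and the model ignores read repair and everything else a real cluster does.

```python
import itertools


def quorum_size(n):
    # Majority quorum for n replicas.
    return n // 2 + 1


def read_outcomes(n, applied):
    """Count read quorums that see the new vs. the old version.

    The new version (1) is assumed to be present on the first `applied`
    replicas; the remaining replicas still hold the old version (0).
    Returns (sees_new, sees_old).
    """
    q = quorum_size(n)
    versions = [1 if i < applied else 0 for i in range(n)]
    sees_new = sees_old = 0
    for quorum in itertools.combinations(range(n), q):
        # A read returns the newest version held by any replica it contacted.
        if max(versions[i] for i in quorum) == 1:
            sees_new += 1
        else:
            sees_old += 1
    return sees_new, sees_old


n = 3
for applied in range(n + 1):
    new, old = read_outcomes(n, applied)
    print('applied on %d of %d replicas: %d quorums see new, %d see old'
          % (applied, n, new, old))
```

In this model, with 3 replicas and the write applied on only 1 of them, 2 of the 3 possible read quorums return the new version and 1 returns the old one, so a reader could see the new value and then the old one. Once the write has reached a full quorum, every read quorum overlaps it.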

On Apr 16, 2011, at 5:20 PM, Tyler Hobbs wrote:

> James,
> 
> Would you mind sharing your reader process code as well?
> 
> On Fri, Apr 15, 2011 at 1:14 PM, James Cipar <jcipar@cmu.edu> wrote:
> I've been experimenting with the consistency model of Cassandra, and I found something
> that seems a bit unexpected.  In my experiment, I have 2 processes, a reader and a writer,
> each accessing a Cassandra cluster with a replication factor greater than 1.  In addition,
> sometimes I generate background traffic to simulate a busy cluster by uploading a large data
> file to another table.
> 
> The writer executes a loop where it writes a single row that contains just a sequentially
> increasing sequence number and a timestamp.  In Python this looks something like:
> 
>    while time.time() < start_time + duration:
>        target_server = random.sample(servers, 1)[0]
>        target_server = '%s:9160'%target_server
> 
>        row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
>        seqnum += 1
>        # print 'uploading to server %s, %s'%(target_server, row)
> 
>        pool = pycassa.connect('Keyspace1', [target_server])
>        cf = pycassa.ColumnFamily(pool, 'Standard1')
>        cf.insert('foo', row, write_consistency_level=consistency_level)
>        pool.dispose()
> 
>        if sleeptime > 0.0:
>            time.sleep(sleeptime)
> 
> 
> The reader simply executes a loop reading this row and reporting whenever a sequence
> number is *less* than the previous sequence number.  As expected, with consistency_level=ConsistencyLevel.ONE
> there are many inconsistencies, especially with a high replication factor.
> 
> What is unexpected is that I still detect inconsistencies when it is set to ConsistencyLevel.QUORUM.
> This is surprising because the documentation seems to imply that QUORUM will give consistent
> results.  With background traffic the average difference in timestamps was 0.6s, and the maximum
> was >3.5s.  This means that a client sees a version of the row, and can subsequently see
> another version of the row that is 3.5s older than the previous one.
> 
> What I imagine is happening is this, but I'd like someone who knows what they're talking
> about to tell me if it's actually the case:
> 
> I think Cassandra is not using an atomic commit protocol to commit to the quorum of servers
> chosen when the write is made.  This means that at some point in the middle of the write,
> some subset of the quorum have seen the write, while others have not.  At this time, there
> is a quorum of servers that have not seen the update, so depending on which quorum the client
> reads from, it may or may not see the update.
> 
> Of course, I understand that the client is not *choosing* a bad quorum to read from,
> it is just the first `q` servers to respond, but in this case it is effectively random and
> sometimes a bad quorum is "chosen".
> 
> Does anyone have any other insight into what is going on here?
> 
> 
> 
> -- 
> Tyler Hobbs
> Software Engineer, DataStax
> Maintainer of the pycassa Cassandra Python client library
> 

