cassandra-commits mailing list archives

From "bjc (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-781) in a cluster, get_range_slice() does not return all the keys it should
Date Fri, 12 Feb 2010 04:19:27 GMT


bjc commented on CASSANDRA-781:

Ahh!! Right you are: I was using RP (RandomPartitioner) instead of OPP (OrderPreservingPartitioner).
OK, but now here is another problem: if I insert 10 keys and then ask for them back, it works.
However, if I insert 10 more and do a range scan with start="", I don't get the lowest key:

In [17]: run
insert aa3cf33059d64dac8aef4a250bc5ea9c
insert a7dbda2925eb4b439c89cd71d56b5113
insert f28e92d5e5554857940c9d3386bf4121
insert 2b8ec460e7d346cbaf3dcb00e1aaaf91
insert 7792c98f0c3948299c622c73b906df66
insert 37a8bfdb69b642ba8e96d33b060f789d
insert 38c18f5d3d2c46cbb4e44b603a8acdbd
insert bef8104ea9184abaa3f0788ef7b2e0db
insert 934fe04d30cc4a96b1f1a9e7930316b8
insert 1d3413e88af946349f148c4fafeb6bf7
result 1d3413e88af946349f148c4fafeb6bf7
result 2b8ec460e7d346cbaf3dcb00e1aaaf91
result 37a8bfdb69b642ba8e96d33b060f789d
result 38c18f5d3d2c46cbb4e44b603a8acdbd
result 7792c98f0c3948299c622c73b906df66
result a7dbda2925eb4b439c89cd71d56b5113
result aa3cf33059d64dac8aef4a250bc5ea9c
result bef8104ea9184abaa3f0788ef7b2e0db
result f28e92d5e5554857940c9d3386bf4121
start f28e92d5e5554857940c9d3386bf4121
result f28e92d5e5554857940c9d3386bf4121

In [18]: run
insert 4eb0300540ec4b4083fbaf33741fc4a5
insert 12b43ba967314b369faff7e59902d6c2
insert 5b4b729676bc4ea2816620c3b6dff080
insert cf2fda1b11d843f1ae7949dbbb7d179d
insert c9d0cf4a1e9a48caa143afd2b0268f70
insert 9a044cff59b940d5bfbeffd58b01ee8e
insert d2ee042f0b0b4f7ea86e6e2c0dfdcfdd
insert d239aee577684c27afea2fe7e3361bdf
insert 706b20976f974de49bda61d55b9c2a63
insert 36177455bc3b4469b7e6f51897c9f3ba
result a7dbda2925eb4b439c89cd71d56b5113
result aa3cf33059d64dac8aef4a250bc5ea9c
result bef8104ea9184abaa3f0788ef7b2e0db
result c9d0cf4a1e9a48caa143afd2b0268f70
result cf2fda1b11d843f1ae7949dbbb7d179d
start cf2fda1b11d843f1ae7949dbbb7d179d
result cf2fda1b11d843f1ae7949dbbb7d179d
result d239aee577684c27afea2fe7e3361bdf
result d2ee042f0b0b4f7ea86e6e2c0dfdcfdd
result f28e92d5e5554857940c9d3386bf4121

In [19]: 

See what I mean? In the first run I inserted "1d3413e88af946349f148c4fafeb6bf7", but in the second
range scan I get "a7dbda2925eb4b439c89cd71d56b5113" back first, even though I set start="".
Could this somehow be my fault too?
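
To make the expectation concrete, here is a small in-memory sketch of what an order-preserving range scan from start="" would presumably return for the 20 keys above. It does not talk to Cassandra at all. The helper name expected_range_scan is hypothetical, and it assumes that plain string ordering of the hex keys matches the ordering OPP uses.

```python
import bisect

# Keys copied from the "insert" lines of the two runs above.
FIRST_RUN = [
    "aa3cf33059d64dac8aef4a250bc5ea9c", "a7dbda2925eb4b439c89cd71d56b5113",
    "f28e92d5e5554857940c9d3386bf4121", "2b8ec460e7d346cbaf3dcb00e1aaaf91",
    "7792c98f0c3948299c622c73b906df66", "37a8bfdb69b642ba8e96d33b060f789d",
    "38c18f5d3d2c46cbb4e44b603a8acdbd", "bef8104ea9184abaa3f0788ef7b2e0db",
    "934fe04d30cc4a96b1f1a9e7930316b8", "1d3413e88af946349f148c4fafeb6bf7",
]
SECOND_RUN = [
    "4eb0300540ec4b4083fbaf33741fc4a5", "12b43ba967314b369faff7e59902d6c2",
    "5b4b729676bc4ea2816620c3b6dff080", "cf2fda1b11d843f1ae7949dbbb7d179d",
    "c9d0cf4a1e9a48caa143afd2b0268f70", "9a044cff59b940d5bfbeffd58b01ee8e",
    "d2ee042f0b0b4f7ea86e6e2c0dfdcfdd", "d239aee577684c27afea2fe7e3361bdf",
    "706b20976f974de49bda61d55b9c2a63", "36177455bc3b4469b7e6f51897c9f3ba",
]


def expected_range_scan(keys, start, count):
    """Hypothetical helper: up to `count` keys >= start, in sorted order.

    Assumes plain string comparison is the key order; start="" means
    "begin at the lowest key".
    """
    ordered = sorted(keys)
    first = bisect.bisect_left(ordered, start)
    return ordered[first:first + count]


all_keys = FIRST_RUN + SECOND_RUN
expected_page = expected_range_scan(all_keys, "", 5)

# The key the second run actually returned first, and how many lower keys
# precede it in sorted order.
observed_first = "a7dbda2925eb4b439c89cd71d56b5113"
keys_skipped = sorted(all_keys).index(observed_first)

print(expected_page[0])
print(keys_skipped)
```

Under that assumption, the first row of the second scan should be one of the newly inserted keys, not "a7dbda2925eb4b439c89cd71d56b5113".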

Test follows:

import uuid

from thrift.transport import TSocket
from thrift.transport import TTransport
# Import the module rather than the class, so the qualified name below resolves.
from thrift.protocol import TBinaryProtocol

from cassandra import Cassandra
from cassandra.ttypes import *

# Connect to one node of the cluster over Thrift.
socket = TSocket.TSocket("", 9160)
transport = TTransport.TBufferedTransport(socket)
protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
client = Cassandra.Client(protocol)
# The transport has to be opened before the first call.
transport.open()

ks = "Keyspace1"
cf = "Super1"
path = ColumnPath(cf, "foo", "is")
value = "cool"

for i in xrange(10):
    key = uuid.uuid4().hex
    print "insert", key
    client.insert(ks, key, path, value, 0, ConsistencyLevel.ONE)

parent = ColumnParent(column_family=cf)
slice_range = SliceRange(start="key", finish="key")
predicate = SlicePredicate(slice_range=slice_range)

result = client.get_range_slice(ks, parent, predicate, "", "", 5, ConsistencyLevel.ONE)
for row in result:
    print "result", row.key

start = result[-1].key

print "start", start

result = client.get_range_slice(ks, parent, predicate, start, "", 10, ConsistencyLevel.ONE)
for row in result:
    print "result", row.key
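
The second call in the test starts at the last key of the first page, and the output above shows that key coming back again, so the start key appears to be inclusive. A minimal sketch of paging through all keys under that assumption follows. The names iter_keys, fetch_page and make_fake_fetch are hypothetical and not part of the Cassandra API; the in-memory stand-in only mimics a correctly ordered scan.

```python
import bisect


def iter_keys(fetch_page, page_size):
    """Yield each key once, paging with an inclusive start key.

    fetch_page(start, count) is assumed to return up to `count` keys
    >= start in key order.
    """
    if page_size < 2:
        # With an inclusive start, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""
    first_page = True
    while True:
        page = fetch_page(start, page_size)
        # After the first page, the start key was already yielded: drop it.
        new_keys = page if first_page else [k for k in page if k != start]
        for key in new_keys:
            yield key
        if len(page) < page_size or not new_keys:
            break
        start = page[-1]
        first_page = False


def make_fake_fetch(keys):
    """In-memory stand-in for a range scan over sorted keys."""
    ordered = sorted(keys)

    def fetch_page(start, count):
        first = bisect.bisect_left(ordered, start)
        return ordered[first:first + count]

    return fetch_page


sample_keys = ["%02x" % i for i in range(7)]
paged = list(iter_keys(make_fake_fetch(sample_keys), 3))
```

Against a real cluster, fetch_page could wrap the get_range_slice() call from the test above and return [row.key for row in result].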

> in a cluster, get_range_slice() does not return all the keys it should
> ----------------------------------------------------------------------
>                 Key: CASSANDRA-781
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5
>         Environment: Debian 5 lenny on EC2, Gentoo linux, Windows XP
>            Reporter: bjc
>            Assignee: Jonathan Ellis
>             Fix For: 0.5, 0.6
>         Attachments: 781.txt
> get_range_slice() does not return the same set of keys as get_key_range() in 0.5.0 final.
> I posted a program to reproduce the behavior:
> Apparently, you must have more than one node to get the behavior. Also, it may depend
> on the locations of the nodes on the ring. I.e., if you don't generate enough keys randomly,
> then by chance they could all fall on the same host and you might not see the behavior, although
> I was able to get it to happen using only 2 nodes and 10 keys.
> Here are the other emails describing the issue:

