Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of tyler@datastax.com designates
 74.125.82.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <AANLkTin+ZJtwbEQUXk-ZRHRQ7ZyXt31gft3ff=SQNX8D@mail.gmail.com>
References: <AANLkTimsfubAnkLc5GYWv68Ui5wP-4bhJ4UxzuDnjw3-@mail.gmail.com>
	<A3D043DA-DBC0-4EB9-812F-7F19A1828208@thelastpickle.com>
	<AANLkTin+ZJtwbEQUXk-ZRHRQ7ZyXt31gft3ff=SQNX8D@mail.gmail.com>
Date: Fri, 18 Feb 2011 00:59:36 -0600
Message-ID: <AANLkTi=gkNyyF3e-68Lx7Qj9ZvP5KCvEgQ2zgga1bFG6@mail.gmail.com>
Subject: Re: Inconsistent result in super range slice query (reversed order)
From: Tyler Hobbs <tyler@datastax.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=0015177fcec4a2100b049c890e4c

--0015177fcec4a2100b049c890e4c
Content-Type: text/plain; charset=ISO-8859-1

I'm unable to reproduce this in pycassa starting with a clean database.  Are
you doing anything else to these rows besides inserting them?

The complete script I'm using is below.  Could you run it and confirm
whether it reproduces the problem for you?

- Tyler

=========

import sys
import pycassa

pool = pycassa.ConnectionPool('Keyspace1')
cf = pycassa.ColumnFamily(pool, 'Super1')

KEY = 'key'

columns = [
    "20031210020333/190209-20031210-4476807-s/"  , #0
    "20031210020333/190209-20031210-4476807-s/0" , #1
    "20031210021940/190209-20031210-4476883-s/"  , #2
    "20031210021940/190209-20031210-4476883-s/0" , #3
    "20031210022059/190209-20031210-4476885-s/"  , #4
    "20031210022059/190209-20031210-4476885-s/0" , #5
    # <--Problem_around_here.
    "20031210022154/190209-20031210-4476888-s/"  , #6
    "20031210022154/190209-20031210-4476888-s/0"   #7
]

for supercolumn in columns:
    cf.insert(KEY, {supercolumn: {'subcol': 'subval', 'subcol2': 'subval'}})

def get_cols(start_date, end_date, reversed):
    for key, cols in cf.get_range(start = KEY,
                                  finish = KEY,
                                  column_reversed=reversed,
                                  column_count=10000,
                                  column_start=start_date,
                                  column_finish=end_date):
        for supercol, subcols in cols.iteritems():
            print "col='%s' \tlen = %d" % (supercol, len(subcols))

start = 0
for end in [0,3,5,7]:
    print "\nstart %d, end %d + 'z'" % (start, end)
    get_cols(columns[start], columns[end] + 'z', False)

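# Reversed slices: column_start must be the higher name (start + 'z'),
# column_finish the lower one, and column_reversed must be True.
end = 0
for start in [0, 3, 5, 7]:
    print "\nstart %d + 'z', end %d (reversed)" % (start, end)
    get_cols(columns[start] + 'z', columns[end], True)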

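For comparison, here is a minimal sketch of what each slice in the thread
should return, modelled in plain Python with no cluster involved. It assumes
the comparator orders names as raw bytes (the same as Python string ordering
for these ASCII names) and that slice bounds are inclusive. slice_names is a
hypothetical helper written for this illustration; it is not part of pycassa
or Cassandra.

```python
import bisect

# The eight super column names from the thread, in forward (byte) order.
COLUMNS = [
    "20031210020333/190209-20031210-4476807-s/",   # 0
    "20031210020333/190209-20031210-4476807-s/0",  # 1
    "20031210021940/190209-20031210-4476883-s/",   # 2
    "20031210021940/190209-20031210-4476883-s/0",  # 3
    "20031210022059/190209-20031210-4476885-s/",   # 4
    "20031210022059/190209-20031210-4476885-s/0",  # 5
    "20031210022154/190209-20031210-4476888-s/",   # 6
    "20031210022154/190209-20031210-4476888-s/0",  # 7
]


def slice_names(names, start, finish, reverse=False):
    """Model a column slice with inclusive bounds (hypothetical helper).

    Forward:  start <= name <= finish, lowest name first.
    Reversed: finish <= name <= start, highest name first.
    An empty string means "unbounded" on that side.
    """
    names = sorted(names)
    # In a reversed slice the caller passes the higher bound as `start`.
    lo, hi = (finish, start) if reverse else (start, finish)
    left = bisect.bisect_left(names, lo) if lo else 0
    right = bisect.bisect_right(names, hi) if hi else len(names)
    selected = names[left:right]
    return selected[::-1] if reverse else selected


if __name__ == "__main__":
    # The failing case from the thread: from #5z down to #0, reversed.
    for name in slice_names(COLUMNS, COLUMNS[5] + "z", COLUMNS[0], reverse=True):
        print(name)
```

Under those assumptions, the reversed slice from #5z to #0 should list #5
down to #0, the mirror image of the forward slice from #0 to #5z, so an
empty result from the server does not match this model.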

On Thu, Feb 17, 2011 at 11:09 PM, Shotaro Kamio <kamioshot@gmail.com> wrote:

> Hi Aaron,
>
> Range slice means get_range_slices() in thrift api,
> createSuperSliceQuery in hector, get_range() in pycassa. The example
> code in pycassa is attached below.
>
> The problem is a little complicated to explain, so I'll try to
> describe it with examples.
> Here are the 8 super column names that exist on the specific key. The
> list is in forward order.
>
> #0: "20031210020333/190209-20031210-4476807-s/"
> #1: "20031210020333/190209-20031210-4476807-s/0"
> #2: "20031210021940/190209-20031210-4476883-s/"
> #3: "20031210021940/190209-20031210-4476883-s/0"
> #4: "20031210022059/190209-20031210-4476885-s/"
> #5: "20031210022059/190209-20031210-4476885-s/0"  <-- Problem around here.
> #6: "20031210022154/190209-20031210-4476888-s/"
> #7: "20031210022154/190209-20031210-4476888-s/0"
>
> There is no problem if I use super column names that exist on the key.
>
> * Range from #0 to #3 in forward order -> OK
> * Range from #0 to #5 in forward order -> OK
> * Range from #0 to #7 in forward order -> OK
>
> * Range from #7 to #0 in reverse order -> OK
> * Range from #5 to #0 in reverse order -> OK
> * Range from #3 to #0 in reverse order -> OK
>
>
> However, because I want to scan orders in a certain range, I use
> column names with the character "z" appended (higher than anything in
> order_id). Those column names are listed below as #1z, #3z, #5z and
> #7z. Note that these super column names don't actually exist on the key.
> (#4+ is a column name that sorts between #4 and #5.)
>
> #0 : "20031210020333/190209-20031210-4476807-s/"
> #1 : "20031210020333/190209-20031210-4476807-s/0"
> #1z: "20031210020333/190209-20031210-4476807-s/z" (don't exist)
> #2 : "20031210021940/190209-20031210-4476883-s/"
> #3 : "20031210021940/190209-20031210-4476883-s/0"
> #3z: "20031210021940/190209-20031210-4476883-s/z" (don't exist)
> #4 : "20031210022059/190209-20031210-4476885-s/"
> #4+: "20031210022059/190209-20031210-4476885-s/+" (don't exist)
> #5 : "20031210022059/190209-20031210-4476885-s/0"  <-- Problem around here.
> #5z: "20031210022059/190209-20031210-4476885-s/z" (don't exist)
> #6 : "20031210022154/190209-20031210-4476888-s/"
> #7 : "20031210022154/190209-20031210-4476888-s/0"
> #7z: "20031210022154/190209-20031210-4476888-s/z" (don't exist)
>
> Then, try to range slice them.
>
> * Range from #0 to #3z in forward order -> OK
> * Range from #0 to #4+ in forward order -> OK
> * Range from #0 to #5z in forward order -> OK
> * Range from #0 to #7z in forward order -> OK
>
> * Range from #7z to #0 in reverse order -> OK
> * Range from #5z to #0 in reverse order -> FAIL (no result)
> * Range from #4+ to #0 in reverse order -> OK
> * Range from #3z to #0 in reverse order -> OK
>
> The problem happens in this case. No error or warning is shown in cassandra
> log.
>
> Also, I tried dumping data into json via sstable2json and restored it
> with json2sstable. But the same problem occurs.
>
>
> The code I used for the test is something like this.
> ----------------------
> client = pycassa.connect(KEYSPACE, [ CASSANDRA_HOST ])
> cf = pycassa.ColumnFamily(client, COLUMN_FAMILY)
>
> columns = [
> "20031210020333/190209-20031210-4476807-s/"  , #0
> "20031210020333/190209-20031210-4476807-s/0" , #1
> "20031210021940/190209-20031210-4476883-s/"  , #2
> "20031210021940/190209-20031210-4476883-s/0" , #3
> "20031210022059/190209-20031210-4476885-s/"  , #4
> "20031210022059/190209-20031210-4476885-s/0" , #5
> # <--Problem_around_here.
> "20031210022154/190209-20031210-4476888-s/"  , #6
> "20031210022154/190209-20031210-4476888-s/0"   #7
> ]
>
> reversed = False
> if len(sys.argv) > 1:
>    # Use reversed order if the "-r" option is given; "-f" or anything else
>    # means forward order. With no option, all column names are listed.
>    reversed = (sys.argv[1] == '-r')
>
>    start_date = columns[0]
>    end_date  = columns[7] + "z" # add "z" to make problem.
>
>    if reversed:
>        temp = start_date
>        start_date = end_date
>        end_date   = temp
>        pass
> else:
>    start_date = end_date = ''
>    pass
>
> print "start_date =", start_date, "end_date =", end_date, \
>       "reversed = ", reversed
>
> for it in cf.get_range(start = A_KEY, finish = A_KEY,
>                        column_reversed=reversed, column_count=10000,
>                        column_start=start_date, column_finish=end_date):
>
>    for d in it[1].iteritems():
>        print "col='%s', len = %d" % (d[0], len(d[0]))
>        pass
>    pass
>
> -------------------------
>
>
> Regards,
> Shotaro
>
>
>
>
> On Fri, Feb 18, 2011 at 5:19 AM, Aaron Morton <aaron@thelastpickle.com>
> wrote:
> > First, some terminology: when you say range slice, do you mean getting
> > multiple rows? Or do you mean get_slice, where you return multiple super
> > columns from one row?
> >
> > Your examples look like you want to get multiple super columns from one
> > row, in which case the choice of partitioner is not important. The
> > comparator and subcomparator specified in the CF definition control the
> > ordering of columns. If possible, I would suggest using the random
> > partitioner.
> >
> > Could you provide examples of how you are doing the queries using pycassa?
> > We may be able to help.
> >
> > My initial guess is that the ranges you specify for the query are not
> > correct when using ASCII ordering for column names, e.g.
> >
> > 20031210 < 20031210022059/190209-20031210-4476885-s/z is true
> >
> > 20031210022059/190209-20031210-4476885-s/z < 20031210 is not true
> >
> > Try appending the highest-value ASCII character to the end of 20031210
> >
> > Cheers
> > Aaron
> >
> > On 18/02/2011, at 4:35 AM, Shotaro Kamio <kamioshot@gmail.com> wrote:
> >
> >> Hi,
> >>
> >> We are in trouble with a strange behavior in cassandra 0.7.2 (also
> >> happened in 0.7.0). Could someone help us?
> >>
> >> The problem happens on a column family of super column type named "Order".
> >> Data structure is something like:
> >>  Order[ a_key ][ date + "/" + order_id + "/" (+ suffix) ][attribute] = value
> >>
> >> For example,
> >> Order[ "100" ][ "20031210022059/190209-20031210-4476885-s/" ]
> >> is a super column.
> >> Because we want to scan them in the latest-first order, range slice
> >> query with reversed order is used. (Partitioner is
> >> ByteOrderedPartitioner).
> >>
> >> In some supercolumns in my cassandra instance, reversed query returns
> >> no result while it should have results.
> >> For instance,
> >>
> >> * Range slice in normal (lexical)-order ( Order[ "100" ] [ from
> >> "20031210" to "20031210022059/190209-20031210-4476885-s/z" ] ) will
> >> return results correctly.
> >>
> >> col='20031210014347/190209-20031210-4476668-s/'
> >> col='20031210014347/190209-20031210-4476668-s/0'
> >> col='20031210022059/190209-20031210-4476885-s/'
> >> col='20031210022059/190209-20031210-4476885-s/0'
> >>
> >> * Range slice in reversed (latest-first)-order ( Order[ "100" ] [ from
> >> "20031210022059/190209-20031210-4476885-s/z" to  "20031210" ] ) will
> >> return NO result!
> >>
> >> Note that the super column name
> >> "20031210022059/190209-20031210-4476885-s/z" doesn't exist. The query
> >> should work. And, it succeeds in other super columns.
> >>
> >> * Range slice in reversed (latest-first)-order starting from existing
> >> column name ( Order[ "100" ] [ from
> >> "20031210022059/190209-20031210-4476885-s/0" to "20031210" ] ) will
> >> return the expected results.
> >>
> >> Both pycassa and hector show the same behavior on the same column
> >> name. I guess that cassandra has some logical error.
> >>
> >>
> >> I'll appreciate any help.
> >>
> >>
> >> Best regards,
> >> Shotaro
> >
>
>
>
> --
> Shotaro Kamio
>


-- 
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
Python client library

--0015177fcec4a2100b049c890e4c--