incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@riptano.com>
Subject Re: Sorting problem on supercolumns names using OPP on 0.6.2
Date Mon, 06 Dec 2010 18:29:22 GMT
How are you packing the longs into strings?  The large negative numbers
point to that being done incorrectly.

Bit-shifting each byte of the long into a char[8] and then building the
string from that char[] is the best way to go.  Cassandra also expects
big-endian longs.
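
A minimal sketch of that packing in C++, assuming the column name is passed
to Thrift as a std::string. The helper names pack_long_be and unpack_long_be
are hypothetical and not part of Thrift or Cassandra. The string is built
with an explicit length of 8 so that embedded zero bytes are kept.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode a 64-bit integer as 8 bytes,
// most significant byte first (big-endian).
std::string pack_long_be(int64_t value) {
    // Shift on the unsigned representation so negative values
    // are handled without sign-extension surprises.
    uint64_t bits = static_cast<uint64_t>(value);
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<char>((bits >> (8 * (7 - i))) & 0xFF);
    }
    return out;
}

// Hypothetical helper: decode 8 big-endian bytes back into a 64-bit integer.
int64_t unpack_long_be(const std::string& bytes) {
    if (bytes.size() != 8) {
        throw std::invalid_argument("expected exactly 8 bytes");
    }
    uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | static_cast<unsigned char>(bytes[i]);
    }
    return static_cast<int64_t>(bits);
}
```

With this encoding, pack_long_be(epochSeconds) would be used as the
supercolumn name on insert, and unpack_long_be on the names read back.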

- Tyler

On Mon, Dec 6, 2010 at 11:55 AM, Guillermo Winkler <gwinkler@inconcertcc.com
> wrote:

> I'm using thrift in C++ and inserting the results in a vector of pairs, so
> client-side-mangling does not seem to be the problem.
>
> Also I'm using a "test" column where I insert the same value I'm using as
> super column name (in this case the same date converted to string) and when
> queried using cassandra cli is unsorted too:
>
> cassandra> get Events.EventsByUserDate ['guille']
> => (super_column=9088542550893002752,
>
> (column=4342323443303834363833383437454339364433324530324538413039373736,
> value=2010-12-06 17:43:36.000, timestamp=1291657416526732))
> => (super_column=5990347482238812160,
>
> (column=41414e4c6b54696d6532423656566e6869667a336f654b6147393d2d395a4e797441397a744f39686d3147392b406d61696c2e676d61696c2e636f6d,
> value=2010-12-06 17:46:08.000, timestamp=1291657568569039))
> => (super_column=-3089190841516818432,
>
> (column=3634343644353236463830303437363542454245354630343845393533373337,
> value=2010-12-06 17:44:47.000, timestamp=1291657487450738))
> => (super_column=-4026221038986592256,
>
> (column=62303232396330372d636430612d343662332d623834382d393632366136323061376532,
> value=2010-12-06 17:39:50.000, timestamp=1291657190117981))
>
>
>
>
> On Mon, Dec 6, 2010 at 3:02 PM, Tyler Hobbs <tyler@riptano.com> wrote:
>
>> What client are you using?  Is it storing the results in a hash map or
>> some other type of
>> non-order preserving dictionary?
>>
>> - Tyler
>>
>>
>> On Mon, Dec 6, 2010 at 10:11 AM, Guillermo Winkler <
>> gwinkler@inconcertcc.com> wrote:
>>
>>> Hi, I've the following schema defined:
>>>
>>> EventsByUserDate : {
>>>   UserId : {
>>>     epoch: { // SC
>>>       IID,
>>>       IID,
>>>       IID,
>>>       IID
>>>     },
>>>     // and the other events in time
>>>     epoch: {
>>>       IID,
>>>       IID,
>>>       IID
>>>     }
>>>   }
>>> }
>>> <ColumnFamily ColumnType="Super" CompareWith="LongType"
>>> CompareSubcolumnsWith="BytesType" Name="EventsByUserDate "/>
>>>
>>> Here I expect to store all the event ids for a user ordered by date
>>> (seconds since the epoch, as a long long). I'm using
>>> OrderPreservingPartitioner.
>>>
>>> But a call to:
>>>
>>> GetSuperRangeSlices("EventsByUserDate ",  --column family
>>>                     "",      --supercolumn
>>>                     userId,  --startkey
>>>                     userId,  --endkey
>>>                     {
>>>                       column_names = {},
>>>                       slice_range = {
>>>                         start = "",
>>>                         finish = "",
>>>                         reversed = true,
>>>                         count = 20} },
>>>                     1  --total keys
>>>                    )
>>>
>>> does not sort correctly by supercolumn (the supercolumn names come out
>>> unsorted). This is a sample output for the previous query using thrift
>>> directly:
>>>
>>> SC 1291648883
>>> SC 1291588465
>>> SC 1291588453
>>> SC 1291586385
>>> SC 1291587408
>>> SC 1291588174
>>> SC 1291585331
>>> SC 1291587116
>>> SC 1291651116
>>> SC 1291586332
>>> SC 1291588548
>>> SC 1291588036
>>> SC 1291648703
>>> SC 1291583651
>>> SC 1291583650
>>> SC 1291583649
>>> SC 1291583648
>>> SC 1291583647
>>> SC 1291583646
>>> SC 1291587485
>>>
>>>
>>> Anything I'm missing regarding sorting schemes?
>>>
>>> Thanks,
>>> Guille
>>>
>>>
>>
>
