cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: LongType from user input
Date Thu, 30 Sep 2010 19:51:42 GMT
It's specified as part of the schema functions such as system_add_keyspace, so it will depend
on which language/client you are using to add the keyspace.

Take a look at /test/system/ in test_column_validators() line 1237 for
an example of using the Thrift API in Python (you would normally use a higher level API) and
take a look at interface/cassandra.thrift for the raw interface definition. The sample Keyspace1
that is included in the default cassandra.yaml also includes a CF with an indexed column (Indexed1).

Specifically, a KsDef has a list of CfDefs. The CfDef has an optional list of ColumnDefs;
indexed columns are defined in the ColumnDef.

AFAIK KEYS is the only index_type at the moment, and EQUAL is the only operator supported. 
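
To make the nesting concrete, here is a minimal sketch in Python. The dataclasses below are illustrative stand-ins, not the classes generated from interface/cassandra.thrift; the field names (cf_defs, column_metadata, validation_class, index_type) and the column name "birthdate" are assumptions made for this example and should be checked against the Thrift definition before use.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Stand-in types only: they mirror the KsDef -> CfDef -> ColumnDef nesting
# described above and are NOT the Thrift-generated classes.
@dataclass
class ColumnDef:
    name: bytes
    validation_class: str
    index_type: Optional[str] = None  # assumed: "KEYS" marks the column as indexed


@dataclass
class CfDef:
    keyspace: str
    name: str
    column_metadata: List[ColumnDef] = field(default_factory=list)  # optional list


@dataclass
class KsDef:
    name: str
    cf_defs: List[CfDef] = field(default_factory=list)


def indexed_columns(ks: KsDef) -> List[Tuple[str, bytes]]:
    """Return (column family name, column name) for every indexed column."""
    return [
        (cf.name, col.name)
        for cf in ks.cf_defs
        for col in cf.column_metadata
        if col.index_type is not None
    ]


# Hypothetical keyspace with one CF that has one indexed column.
ks = KsDef(
    name="Keyspace1",
    cf_defs=[
        CfDef(
            keyspace="Keyspace1",
            name="Indexed1",
            column_metadata=[ColumnDef(b"birthdate", "LongType", "KEYS")],
        )
    ],
)
print(indexed_columns(ks))
```

With a real client, the equivalent objects would be built from the generated Thrift types and passed to system_add_keyspace.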


On 01 Oct 2010, at 08:39 AM, Christian Decker <> wrote:

Sorry for my continued questioning, by now I must have bored everyone about this topic. I
have one last question: how do you create such an index? I found only ColumnFamilyStoreTest.testIndexCreate(),
but it appears that this uses internal API, and thus cannot be invoked from a client side
or through JMX, right? From there on I should be able to find my way around the system :-)

On Thu, Sep 30, 2010 at 4:56 PM, Stu Hood <> wrote:
Take a look at the get_indexed_slices method in the 0.7.0-beta Thrift interface.

-----Original Message-----
From: "Christian Decker" <>
Sent: Thursday, September 30, 2010 4:38am
Subject: Re: LongType from user input

I just read through the tickets on Jira, and it appears that indices are
implemented in the 0.7 source tree, but I cannot find any pointer on how to
use them. I'll be trying to create a custom CassandraStorage that loads data
through the indices, anyone else interested?


On Thu, Sep 30, 2010 at 10:56 AM, Aaron Morton <> wrote:

> AFAIK indexes are still in dev. The only example is in the
> source tree.
> Aaron
> On 30 Sep 2010, at 20:10, Christian Decker <>
> wrote:
> Apparently I have blanked the 0.7 completely out of my memory. I was trying
> to implement application layer indices and ignored the fact that Cassandra
> 0.7 is implementing them by default. I found ticket CASSANDRA-749 about the
> indices and am reading through the code right now, but is there a higher
> level overview and a tutorial on how to get things started with these
> indices (and maybe some inner workings)? This might actually solve all of my
> problems I'm having right now :-)
> Regards,
> Chris
> On Mon, Sep 27, 2010 at 3:45 AM, Aaron Morton <> wrote:
>> The only thing I can think of is that values need to be in the correct
>> byte format when used in indexes in 0.7. Take a look at the module
>> in the pycassa client
>> for an example of which values need to
>> be byte packed.
>> How is your pig function working against cassandra? Is it using the
>> ColumnFamilyRecordReader? The code in the internal RowIterator for that
>> class has an example calling the cluster to get to the comparators.
>> Aaron
>> On 27 Sep 2010, at 03:11 AM, Christian Decker <> wrote:
>> Hi Aaron,
>> what changes can I expect in the 0.7 release regarding Comparison and
>> Parameters? My problem is mainly that I want to take Strings from stdin (or
>> Pig Scripts for that matter) and convert them in such a way that they are
>> interpreted correctly and converted to the corresponding byte representation
>> to use them in column names and keys.
>> Regards,
>> Chris
>> On Sun, Sep 26, 2010 at 5:20 AM, Aaron Morton <> wrote:
>>> Things are changing in v0.7; the row keys are byte arrays.
>>> Not sure I understand your other concerns.
>>> Aaron
>>> On 25 Sep 2010, at 08:10, Christian Decker <> wrote:
>>> Thanks for your quick answer, I think I'll use an affix to sort of cast
>>> the keys, ranges and others from their textual representation (from Pig) to
>>> the desired byte representation, since I just noticed that the keys for the
>>> rows themselves are always UTF8 interpreted, and since I want to make
>>> key-range as well as slice queries, I'll be better off this way I think.
>>> I'll just add a 'L' for Long and 'U' for UUID (of any kind).
>>> Or is there a better way that I just can't see from my beginner's angle?
>>> :-)
>>> Regards,
>>> Chris
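
The affix idea above could look roughly like the following Python sketch. It assumes the prefix is the first character of the textual value ('L' for Long, 'U' for UUID) and that anything else is treated as UTF-8; the function name encode_tagged is hypothetical and not part of any Cassandra or Pig API.

```python
import struct
import uuid


def encode_tagged(text: str) -> bytes:
    """Convert a prefixed textual value to its byte representation.

    'L<digits>' -> 8-byte big-endian signed long
    'U<uuid>'   -> 16 raw UUID bytes
    otherwise   -> UTF-8 encoded text
    """
    if text.startswith("L"):
        return struct.pack(">q", int(text[1:]))
    if text.startswith("U"):
        return uuid.UUID(text[1:]).bytes
    return text.encode("utf-8")


print(encode_tagged("L100").hex())
print(encode_tagged("U00000000-0000-0000-0000-000000000001").hex())
print(encode_tagged("abc"))
```

One caveat with this scheme: a plain UTF-8 value that happens to start with 'L' or 'U' is ambiguous, so a real implementation would need an escape rule or an explicit tag for text as well.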
>>> On Fri, Sep 24, 2010 at 6:35 PM, Tyler Hobbs <> wrote:
>>>> Yes, you can use describe_keyspace() and then look through the results.
>>>> It's a little ugly in 0.6, but it works.
>>>> - Tyler
>>>> On Fri, Sep 24, 2010 at 11:25 AM, Christian Decker <> wrote:
>>>>> Well I'm writing a loading function for Pig, and as it happens I want
>>>>> to be able to load slices from cassandra which are specified in the pig
>>>>> script (thus the input from stdin) but the ColumnFamily from which to get
>>>>> the data is another parameter and some of the CFs have UTF8, UUID, TimeUUID
>>>>> or Long types for their keys and columns, so simply converting everything
>>>>> to an 8byte long would break compatibility with the others.
>>>>> Now thinking about it I attacked the whole problem in a weird way,
>>>>> since UUID types won't work either.
>>>>> So let me change my question slightly, is there a way in 0.6 to detect
>>>>> the compareWith type on a running cluster? That way I could convert it to
>>>>> the right type :D
>>>>> Regards,
>>>>> Chris
>>>>> On Fri, Sep 24, 2010 at 6:09 PM, Tyler Hobbs <> wrote:
>>>>>> I'm not sure I understand why using this with multiple column families
>>>>>> prevents you from converting it.  Could you clarify this?
>>>>>> On Fri, Sep 24, 2010 at 10:56 AM, Christian Decker <> wrote:
>>>>>>> Hi all,
>>>>>>> I'm having quite a dilemma with the CompareWith attribute. The
>>>>>>> problem is that I have numeric IDs that I'd like to use as row keys, only
>>>>>>> that I also have to offer a possibility to let users input them from std
>>>>>>> input. Since I cannot ask my users to input an 8byte sequence for
>>>>>>> the ID they'd like, I was about to turn to UTF8, when I remembered that they
>>>>>>> are compared lexicographically, so that 100 actually comes before 2, which
>>>>>>> kills key slices. Also I cannot just code a converter in since this is
>>>>>>> supposed to be used with multiple columnfamilies, so just converting the
>>>>>>> integer read into 8bytes isn't going to work either.
>>>>>>> Any tricks for this one?
>>>>>>> Regards,
>>>>>>> Chris
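
The ordering problem in the original question, and the byte packing suggested later in the thread, can be illustrated with a short Python sketch. It assumes LongType values are 8-byte big-endian signed integers; the helper names pack_long and unpack_long are made up for this example.

```python
import struct


def pack_long(value: int) -> bytes:
    """Pack an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)


def unpack_long(raw: bytes) -> int:
    """Inverse of pack_long."""
    return struct.unpack(">q", raw)[0]


ids = [2, 100, 15]

# As text, the IDs sort lexicographically: "100" comes before "2".
as_text = sorted(str(i) for i in ids)

# As packed longs, the non-negative IDs sort in numeric order even when
# compared byte by byte.
as_longs = sorted(pack_long(i) for i in ids)

print(as_text)
print([unpack_long(raw) for raw in as_longs])
```

Plain byte-wise sorting only matches numeric order for non-negative values; a comparator that decodes the long before comparing does not have that restriction.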
