The only thing I can think of is that values need to be in the correct byte format when used in indexes in 0.7. Take a look at the types.py module in the pycassa client http://github.com/pycassa/pycassa
for an example of which values need to be byte packed.
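As a rough illustration of what "byte packed" means here, the following is a minimal sketch in plain Python that does not use pycassa. The helper names are hypothetical, and it assumes that a Long comparator expects an 8-byte big-endian value and that the UUID comparators expect the 16 raw bytes of the UUID.

```python
import struct
import uuid


def pack_long(value):
    """Pack an integer (or its decimal text) as an 8-byte big-endian signed long."""
    return struct.pack(">q", int(value))


def pack_uuid(text):
    """Return the 16 raw bytes of a UUID given in its usual textual form."""
    return uuid.UUID(text).bytes


def pack_utf8(text):
    """Encode text as UTF-8 bytes."""
    return text.encode("utf-8")
```

The types.py module mentioned above remains the better reference for which values the client actually packs.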
How is your Pig function working against Cassandra? Is it using the ColumnFamilyRecordReader? The code in the internal RowIterator for that class has an example of calling the cluster to get at the comparators.
On 27 Sep, 2010, at 03:11 AM, Christian Decker <email@example.com> wrote:
What changes can I expect in the 0.7 release regarding comparison and parameters? My problem is mainly that I want to take strings from stdin (or Pig scripts, for that matter) and convert them to the corresponding byte representation, so that they are interpreted correctly when used as column names and keys.
On Sun, Sep 26, 2010 at 5:20 AM, Aaron Morton <firstname.lastname@example.org>
Things are changing in v0.7: the row keys are byte arrays.
Not sure I understand your other concerns.
Thanks for your quick answer. I think I'll use an affix to cast the keys, ranges and other values from their textual representation (from Pig) to the desired byte representation. I just noticed that the row keys themselves are always interpreted as UTF8, and since I want to make key-range as well as slice queries, I think I'll be better off this way. I'll just add an 'L' for Long and a 'U' for UUID (of any kind).
Or is there a better way that I just can't see from my beginner's angle? :-)
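The affix idea could look roughly like the sketch below. It assumes the affix is a single leading character, that Long values are packed as 8-byte big-endian, and that anything without a recognised prefix is treated as UTF8 text; the function name cast_prefixed is made up for illustration.

```python
import struct
import uuid


def cast_prefixed(token):
    """Convert 'L<number>' or 'U<uuid>' to bytes; treat anything else as UTF8 text."""
    if token.startswith("L"):
        # 'L' marks a Long: pack the remainder as an 8-byte big-endian value.
        return struct.pack(">q", int(token[1:]))
    if token.startswith("U"):
        # 'U' marks a UUID of any kind: use its 16 raw bytes.
        return uuid.UUID(token[1:]).bytes
    return token.encode("utf-8")
```

One caveat with this scheme: a plain UTF8 value that happens to start with 'L' or 'U' is ambiguous and will raise a ValueError here, so a real loader would probably need an escape or an explicit prefix for text as well.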
On Fri, Sep 24, 2010 at 6:35 PM, Tyler Hobbs <email@example.com>
Yes, you can use describe_keyspace() and then look through the results. It's a little ugly in 0.6, but it works.
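"Looking through the results" might be sketched as below. This assumes that describe_keyspace() in 0.6 returns a mapping from column family name to a map of string properties that includes a "CompareWith" entry holding the comparator's class name; the sample dictionary and the column family name "Events" are invented stand-ins, not real cluster output.

```python
def comparator_of(keyspace_description, column_family):
    """Return the short comparator name (e.g. 'LongType') for one column family."""
    properties = keyspace_description[column_family]
    # Strip the package prefix from the fully qualified class name.
    return properties["CompareWith"].rsplit(".", 1)[-1]


# Hypothetical stand-in for what the Thrift call might return.
sample_description = {
    "Events": {
        "Type": "Standard",
        "CompareWith": "org.apache.cassandra.db.marshal.LongType",
    },
}

comparator = comparator_of(sample_description, "Events")
```

The short name can then be used to pick the matching conversion for values read from the Pig script.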
On Fri, Sep 24, 2010 at 11:25 AM, Christian Decker <firstname.lastname@example.org>
Well, I'm writing a loading function for Pig, and I want to be able to load slices from Cassandra that are specified in the Pig script (thus the input from stdin). The ColumnFamily to read from is another parameter, and some of the CFs have UTF8, UUID, TimeUUID or Long types for their keys and columns, so simply converting everything I get to an 8-byte long would break compatibility with the others.
Now that I think about it, I attacked the whole problem in a weird way, since UUID types won't work either.
So let me change my question slightly, is there a way in 0.6 to detect the compareWith type on a running cluster? That way I could convert it to the right type :D
On Fri, Sep 24, 2010 at 6:09 PM, Tyler Hobbs <email@example.com>
I'm not sure I understand why using this with multiple column families prevents you from converting it. Could you clarify this?
On Fri, Sep 24, 2010 at 10:56 AM, Christian Decker <firstname.lastname@example.org>
I'm having quite a dilemma with the CompareWith attribute. The problem is that I have numeric IDs that I'd like to use as row keys, but I also have to let users enter them from standard input. Since I cannot ask my users to input an 8-byte sequence representing the ID they'd like, I was about to turn to UTF8, when I remembered that UTF8 values are compared lexicographically, so that 100 actually comes before 2, which kills key slices. I also cannot just code a converter in, since this is supposed to be used with multiple column families, so just converting an integer read from input into 8 bytes isn't going to work either.
Any tricks for this one?
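The ordering problem described in this message can be reproduced in a few lines of Python. This is only an illustration of byte-wise comparison, assuming non-negative IDs and an 8-byte big-endian encoding for the numeric case; it does not talk to Cassandra.

```python
import struct

ids = [2, 100, 15]

# Keys stored as UTF8 text sort byte by byte, so "100" lands before "2".
as_text = sorted(str(i).encode("utf-8") for i in ids)

# Keys packed as 8-byte big-endian longs sort in numeric order
# (for non-negative values) under the same byte-wise comparison.
as_long = sorted(struct.pack(">q", i) for i in ids)
decoded = [struct.unpack(">q", b)[0] for b in as_long]

print(as_text)  # [b'100', b'15', b'2']
print(decoded)  # [2, 15, 100]
```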