incubator-cassandra-user mailing list archives

From Robert Jackson <robe...@promedicalinc.com>
Subject Re: How to make use of Cassandra raw row keys?
Date Wed, 25 May 2011 23:05:51 GMT
If you are using version 0.10 or 0.11 of the cassandra gem, you will only get back rows that have values (columns).
This is due to the way Cassandra handles deleted rows: it adds a tombstone, and the row key can still come back
from a range query with no columns. So if you delete a row (or delete all the columns in a row), the gem removes
that row from the hash before it is returned.

Also, the gem sets :key_count to 100 by default. Combined with the delete behavior
(if a decent number of deletes have been done), this could mean that only 8 rows (in your example)
of the first 100 rows returned actually contain values.

If you pass :key_count to get_range, it will continue querying in batches (of 100 by default)
until the requested number of rows has been returned.
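
To illustrate the difference between a single batch and continued batching, here is a small
Ruby model of the behavior described above. It does not use the gem at all: fetch_batch,
first_batch_live_rows and paged_live_rows are hypothetical names, and the "server" is just an
in-memory array in which deleted rows show up as keys with no columns. Treat it as a sketch of
the idea, not as the gem's actual implementation.

```ruby
# Hypothetical stand-in for a range query: returns up to `count`
# [key, columns] pairs starting at `offset`. Deleted rows are still
# present, but with an empty columns hash (the tombstoned rows).
def fetch_batch(rows, offset, count)
  rows[offset, count] || []
end

# One batch only: tombstoned rows are dropped, so fewer than
# `batch_size` rows may come back.
def first_batch_live_rows(rows, batch_size: 100)
  fetch_batch(rows, 0, batch_size).reject { |_key, columns| columns.empty? }.to_h
end

# Keep fetching batches until `key_count` live rows are collected
# or the data runs out.
def paged_live_rows(rows, key_count:, batch_size: 100)
  result = {}
  offset = 0
  while result.size < key_count
    batch = fetch_batch(rows, offset, batch_size)
    break if batch.empty?
    batch.each do |key, columns|
      next if columns.empty?            # skip tombstoned rows
      result[key] = columns
      break if result.size >= key_count
    end
    offset += batch.size
  end
  result
end

# Made-up data: 200 row keys, most of them deleted.
rows = (0...200).map do |i|
  live = i < 100 ? (i % 13).zero? : (i % 3).zero?
  ["session:#{i}", live ? { 'val' => "data#{i}" } : {}]
end

puts first_batch_live_rows(rows).size               # live rows among the first 100 keys
puts paged_live_rows(rows, key_count: 40).size      # keeps paging until 40 live rows are found
```

In this made-up data set only a handful of the first 100 keys are live, which is the same
shape as seeing 8 rows from the client while more live rows exist further along.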

Hope this sheds some light on the cassandra gem internals...

Robert Jackson

Sent from my iPhone

On May 25, 2011, at 6:22 PM, aaron morton <aaron@thelastpickle.com> wrote:

> Hard to say exactly what the issue is. Are they connected to the same node and using
the same Consistency Level?
> 
> Try turning the logging up to DEBUG to see whether they are issuing the same query.
> 
> Hope that helps. 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 26 May 2011, at 02:28, Suan Aik Yeo wrote:
> 
>> Thanks, that definitely helped. Any idea why my client is showing far fewer existing
rows than cassandra-cli though?
>> 
>> I'm using the Ruby Cassandra client, and when I get all the rows for the "Sessions"
cf, I get 8 rows returned. However, when I do "list Sessions" in the cassandra-cli I get 40
rows returned! Is it that the cli will return even the rows whose TTL has expired? Any other
reasons?
>> 
>> 
>> Thanks,
>> Suan
>> 
>> On Tue, May 24, 2011 at 11:07 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
>> The key printed in the DEBUG message is the byte array the server was given as the
key converted to hex. Your client API may have converted the string to ascii bytes before
sending to the server.
>> 
>> e.g. here is me writing a 'foo' key to the server 
>> DEBUG 15:52:15,818 insert writing local RowMutation(keyspace='dev', key='666f6f',
modifications=[data])
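
The hex form of a key can be reproduced and reversed outside the server. A minimal Ruby
sketch, assuming the client sent the key as plain ASCII bytes; key_to_hex and hex_to_key are
hypothetical helper names, not part of any Cassandra client API:

```ruby
# Convert a string key to the hex form shown in the server DEBUG log.
def key_to_hex(key)
  key.unpack1('H*')
end

# Convert a hex key from the DEBUG log back to the original bytes.
def hex_to_key(hex)
  [hex].pack('H*')
end

puts key_to_hex('foo')   # prints 666f6f, matching key='666f6f' in the log line

# The raw key from the original question, decoded back to a readable key.
raw = '73657373696f6e3a6365613765323931353838616437343732363130646163666331643161393334'
puts hex_to_key(raw)
```

Decoding a raw key this way gives the string to compare against the session keys the
client application knows about.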
>> 
>> 
>> You can tell the CLI what data type the keys are; see the assume statement. e.g.
"assume my_cf keys as ascii;" will tell the cli to convert them back to ascii for you.
>> 
>> Hope that helps. 
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 25 May, 2011, at 03:17 PM, Suan Aik Yeo <yeosuanaik@gmail.com> wrote:
>> 
>>> We're using Cassandra to store our sessions, all in a single column family "Sessions"
with the format:
>>> Sessions['session_key'] = {'val': <actual_value>}
>>> (session_key is a randomly generated hash)
>>> 
>>> The "raw" keys I'm talking about are for example the 'key' value as seen from
Cassandra DEBUG output:
>>> insert writing local RowMutation(keyspace='my_keyspace', key='73657373696f6e3a6365613765323931353838616437343732363130646163666331643161393334',
modifications=[Sessions])
>>> 
>>> Today we ran into a problem where a session with a given key (say "session:12345")
seemingly disappeared (at least it appeared that way to the client app), but in the server
log DEBUG output, the "raw" Cassandra key that seemed to correspond to that session_key (say
"a12345f") was still being used as evidenced by DEBUG log output. Indeed, none of the existing
session_keys corresponded to the "a12345f" raw key. However, in Cassandra-cli when I do the
"list Sessions" command, the "a12345f" raw key shows up as part of the output.
>>> 
>>> I'd like to dig further into the issue, but first I need to find out:
>>> what are these keys and how are they determined?
>>> Is there any way I could use them in querying Cassandra to find out what they're
pointing to? (Seems that even the cli expects the "session:12345" type key rather than raw
ones when querying)
>>> 
>>> 
>>> Thanks,
>>> Suan
>> 
> 
