incubator-cassandra-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: Columns limit
Date Sat, 07 Aug 2010 01:18:18 GMT
On 8/6/10 4:50 PM, Thomas Heller wrote:
>> Thanks for the suggestion.
>>
>> I somewhat understand all that; the point where my head begins to explode
>> is when I want to figure out something like
>>
>> Continuing with your example: "Over the last X amount of days give me all
>> the logs for remote_addr:XXX".
>> I'm guessing I would need to create a separate index ColumnFamily???
>>
>>      
> Depending on your needs you can either insert them directly or pull
> them out later in some map/reduce fashion. What you want is another
> column Family and a similar structure.
>
> ColumnFamily Standard "LogByRemoteAddrAndDate" CompareWith: TimeUUID
>
> Row: "127.0.0.1:20100806" Column TimeUUID/JSON as usual. If you want
> to "link" to the actual log record (to avoid writing it multiple
> times) just insert the same timeuuid you inserted into the other CF
> and leave the value empty. So you have your "Index", aka list of
> column names, and you can look up the actual values using get_slice
> with column_names.
>
> Confusing at first, but really quite simple once you get used to the
> idea. Just a lot more work than letting SQL do it for you. ;)
>
> HTH,
> /thomas
>    
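A minimal sketch of the index pattern Thomas describes, using plain Python dicts in place of the two column families. Everything here is illustrative rather than Cassandra client code: the names `logs`, `insert_log` and `logs_for` are hypothetical, and keying the main CF by day is an assumption, since the original CF is not shown in this message.

```python
import json
import uuid

# Hypothetical in-memory stand-ins for the two column families:
# row key -> {column name -> value}.
logs = {}                          # main CF (assumed to be keyed by day)
log_by_remote_addr_and_date = {}   # index CF: TimeUUID names, empty values


def insert_log(day, remote_addr, record):
    """Store the record once, then index it under 'addr:day'."""
    col = uuid.uuid1()  # version-1 (time-based) UUID, as TimeUUID expects
    logs.setdefault(day, {})[col] = json.dumps(record)
    index_key = f"{remote_addr}:{day}"
    # Same column name as in the main CF, value left empty.
    log_by_remote_addr_and_date.setdefault(index_key, {})[col] = ""
    return col


def logs_for(remote_addr, day):
    """Read the index row's column names, then fetch those columns."""
    index_key = f"{remote_addr}:{day}"
    names = list(log_by_remote_addr_and_date.get(index_key, {}))
    row = logs.get(day, {})
    return [json.loads(row[name]) for name in names]
```

The second step in `logs_for` plays the role of the get_slice call with column_names: the index row supplies the names, and the main row supplies the values.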
Ok, I think the part I was missing was the concatenation of the key and
partition to do the lookups. Is this the preferred way of accomplishing
needs such as this? Are there alternative ways?

How would one then "query" over multiple days? Same question for all
days. Should I use range_slice or multiget_slice? And if it's range_slice,
does that mean I need OrderPreservingPartitioner?
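
For a bounded number of days, one possible approach is to compute every "addr:YYYYMMDD" row key up front and request them all at once. The sketch below only builds the key list; the helper name `index_row_keys` is hypothetical and the key format is taken from the quoted example. Since each key is known exactly, such a list could presumably be handed to multiget_slice, which looks rows up by key rather than scanning a key range, so it should not depend on an order-preserving partitioner.

```python
from datetime import date, timedelta


def index_row_keys(remote_addr, last_day, days):
    """Return 'addr:YYYYMMDD' keys for `days` days ending at `last_day`,
    oldest first."""
    return [
        f"{remote_addr}:{(last_day - timedelta(days=offset)).strftime('%Y%m%d')}"
        for offset in range(days - 1, -1, -1)
    ]


# Keys for the three days ending on 2010-08-06.
keys = index_row_keys("127.0.0.1", date(2010, 8, 6), 3)
```

This covers "the last X days" only; it does not address the "all days" case, where the set of row keys is not known in advance.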


