cassandra-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: Columns limit
Date Sat, 07 Aug 2010 04:57:50 GMT
On 8/6/10 6:36 PM, Benjamin Black wrote:
> Same answer as on other thread right now about how to index:
>
> http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/
> http://www.slideshare.net/benjaminblack/cassandra-basics-indexing
>
> On Fri, Aug 6, 2010 at 6:18 PM, Mark<static.void.dev@gmail.com>  wrote:
>    
>> On 8/6/10 4:50 PM, Thomas Heller wrote:
>>      
>>>> Thanks for the suggestion.
>>>>
>>>> I somewhat understand all that; the point where my head begins to
>>>> explode
>>>> is when I want to figure out something like
>>>>
>>>> Continuing with your example: "Over the last X amount of days give me all
>>>> the logs for remote_addr:XXX".
>>>> I'm guessing I would need to create a separate index ColumnFamily???
>>>>
>>>>
>>>>          
>>> Depending on your needs you can either insert them directly or pull
>>> them out later in some map/reduce fashion. What you want is another
>>> column Family and a similar structure.
>>>
>>> ColumnFamily Standard "LogByRemoteAddrAndDate" CompareWith: TimeUUID
>>>
>>> Row: "127.0.0.1:20100806" Column TimeUUID/JSON as usual. If you want
>>> to "link" to the actual log record (to avoid writing it multiple
>>> times) just insert the same timeuuid you inserted into the other CF
>>> and leave the value empty. So you have your "Index", aka list of
>>> column names, and you can look up the actual values using get_slice
>>> with column_names.
>>>
>>> Confusing at first, but really quite simple once you get used to the
>>> idea. Just a lot more work than letting SQL do it for you. ;)
>>>
>>> HTH,
>>> /thomas
>>>
>>>        
>> Ok, I think the part I was missing was the concatenation of the key and
>> partition to do the lookups. Is this the preferred way of accomplishing
>> needs such as this? Are there alternative ways?
>>
>> How would one then "query" over multiple days? Same question for all days.
>> Should I use range_slice or multiget_slice? And if it's range_slice, does that
>> mean I need OrderPreservingPartitioner?
>>
>>
>>
>>      
Sweet. Thanks for the links.
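
For anyone reading this thread later, here is a rough sketch of the index layout Thomas describes, using plain Python dicts in place of real column families, so nothing here is actual Cassandra client code. The names logs, insert_log, index_keys and logs_for_addr are made up for illustration, and keying the main log CF by day (YYYYMMDD) is my assumption, not something stated in the thread. Because the index row keys are built as remote_addr:YYYYMMDD, a query over several days can compute every key up front and fetch them multiget-style, which does not depend on key ordering.

```python
import uuid
from datetime import date, timedelta

# In-memory stand-ins for two column families (illustrative only).
# Assumption: the main log CF is keyed by day, one column per log record.
logs = {}                         # "YYYYMMDD" -> {timeuuid: JSON log line}
log_by_remote_addr_and_date = {}  # "addr:YYYYMMDD" -> {timeuuid: ""}


def insert_log(remote_addr, day, payload):
    """Write the record once, then add an empty-valued index column."""
    col = uuid.uuid1()  # time-based UUID used as the column name
    day_key = day.strftime("%Y%m%d")
    logs.setdefault(day_key, {})[col] = payload
    index_key = f"{remote_addr}:{day_key}"
    # Same timeuuid as in the main CF; the value is left empty.
    log_by_remote_addr_and_date.setdefault(index_key, {})[col] = ""
    return col


def index_keys(remote_addr, end_day, days):
    """Build one index row key per day, newest day first."""
    return [
        f"{remote_addr}:{(end_day - timedelta(days=i)).strftime('%Y%m%d')}"
        for i in range(days)
    ]


def logs_for_addr(remote_addr, end_day, days):
    """Read the index rows, then look up the actual records by column name."""
    results = []
    for key in index_keys(remote_addr, end_day, days):
        day_key = key.rsplit(":", 1)[1]
        for col in log_by_remote_addr_and_date.get(key, {}):
            results.append(logs[day_key][col])
    return results
```

Against a real cluster, the loop over index_keys would presumably become a single multiget_slice call on the index CF, followed by get_slice with column_names on the log CF.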
