incubator-cassandra-user mailing list archives

From Benjamin
Subject Re: Columns limit
Date Sat, 07 Aug 2010 01:36:45 GMT
Same answer as on the other thread right now about how to index:

On Fri, Aug 6, 2010 at 6:18 PM, Mark <> wrote:
> On 8/6/10 4:50 PM, Thomas Heller wrote:
>>> Thanks for the suggestion.
>>> I somewhat understand all that; the point where my head begins to
>>> explode is when I want to figure out something like, continuing with
>>> your example: "Over the last X days, give me all the logs for
>>> remote_addr:XXX".
>>> I'm guessing I would need to create a separate index ColumnFamily?
>> Depending on your needs you can either insert them directly or pull
>> them out later in some map/reduce fashion. What you want is another
>> ColumnFamily with a similar structure.
>> ColumnFamily Standard "LogByRemoteAddrAndDate" CompareWith: TimeUUID
>> Row: "" Column TimeUUID/JSON as usual. If you want
>> to "link" to the actual log record (to avoid writing it multiple
>> times) just insert the same TimeUUID you inserted into the other CF
>> and leave the value empty. So you have your "Index", aka list of
>> column names, and you can look up the actual values using get_slice
>> with column_names.
>> Confusing at first, but really quite simple once you get used to the
>> idea. Just a lot more work than letting SQL do it for you. ;)
>> HTH,
>> /thomas
> Ok, I think the part I was missing was the concatenation of the key and
> partition to do the lookups. Is this the preferred way of accomplishing
> needs such as this? Are there alternative ways?
> How would one then "query" over multiple days? Same question for all days.
> Should I use range_slice or multiget_slice? And if it's range_slice, does
> that mean I need OrderPreservingPartitioner?
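
For illustration only, here is a minimal in-memory sketch of the index
pattern described in the quoted messages. It is not Cassandra client code:
plain dicts stand in for the two column families, and every name in it
(logs, index, index_key, insert_log, logs_for_addr) is hypothetical. It
also assumes that the main log CF is keyed by day and that the index row
key is the remote address and the date joined with "/"; the thread does
not confirm either layout.

```python
import json
import uuid
from datetime import date, timedelta

# Dict stand-ins for the two column families (assumed layout, see above).
logs = {}    # main log CF: row key (day) -> {TimeUUID column: JSON value}
index = {}   # index CF: row key (addr/day) -> {TimeUUID column: ""}


def index_key(remote_addr, day):
    """Build the index row key; the "/" separator is an assumption."""
    return f"{remote_addr}/{day.isoformat()}"


def insert_log(day, record):
    """Write the record once, then 'link' it from the index row."""
    col = uuid.uuid1()  # time-based UUID used as the column name
    logs.setdefault(day.isoformat(), {})[col] = json.dumps(record)
    # Same column name in the index CF, empty value.
    index.setdefault(index_key(record["remote_addr"], day), {})[col] = ""
    return col


def keys_for_last_days(remote_addr, today, days):
    """One index row key per day: the key list a multiget would take."""
    return [index_key(remote_addr, today - timedelta(days=n))
            for n in range(days)]


def logs_for_addr(remote_addr, today, days):
    """Read each day's index row, then fetch the named log columns."""
    results = []
    for n in range(days):
        day = today - timedelta(days=n)
        names = index.get(index_key(remote_addr, day), {})
        row = logs.get(day.isoformat(), {})
        results.extend(json.loads(row[name]) for name in names if name in row)
    return results
```

Because the row keys for "the last X days" can be computed up front, a
lookup by an explicit key list (the multiget_slice style) does not rely on
key ordering, so it should work with any partitioner. Scanning a contiguous
range of keys in key order is the case that depends on an order-preserving
partitioner.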
