cassandra-user mailing list archives

From David Boxenhorn <da...@lookin2.com>
Subject Re: Range search on keys not working?
Date Wed, 09 Jun 2010 16:07:06 GMT
I don't get what you're saying. If you want to loop over your entire range
of keys, you can do it with a range query, and start and finish will both be
"". Is there any scenario where you would want to do a range query where
start and/or finish do not equal "", if you use random partitioning?
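
For what it's worth, here is a minimal sketch of that full scan with the prefix check done on the client. It is not the Hector or Thrift API: TokenOrderedStore, rangeSlice and keysWithPrefix are hypothetical names, and the MD5-based token is only assumed to resemble what the random partitioner does. The start key is always either "" or the last key of the previous page, the end is left open, and the repeated first key of every later page is dropped.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory stand-in for a column family under a random
// partitioner. NOT a Cassandra/Hector class; it only mimics the fact that
// keys come back in token order rather than key order.
class TokenOrderedStore {
    private final List<String> keysInTokenOrder = new ArrayList<>();

    TokenOrderedStore(List<String> keys) {
        keysInTokenOrder.addAll(keys);
        keysInTokenOrder.sort(Comparator.comparing(TokenOrderedStore::token));
    }

    // MD5 of the key as a non-negative integer (assumption: roughly how a
    // random partitioner derives a token from a key).
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(md5.digest(key.getBytes(StandardCharsets.UTF_8))).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Up to max keys in token order, beginning at startKey (inclusive).
    // An empty startKey means "from the beginning"; there is no end bound.
    List<String> rangeSlice(String startKey, int max) {
        int from = 0;
        if (!startKey.isEmpty()) {
            BigInteger startToken = token(startKey);
            while (from < keysInTokenOrder.size()
                    && token(keysInTokenOrder.get(from)).compareTo(startToken) < 0) {
                from++;
            }
        }
        int to = Math.min(from + max, keysInTokenOrder.size());
        return new ArrayList<>(keysInTokenOrder.subList(from, to));
    }
}

class PrefixScan {
    // Pages through every key and keeps those that start with prefix.
    static List<String> keysWithPrefix(TokenOrderedStore store, String prefix, int pageSize) {
        if (pageSize < 2) {
            // With a page size of 1 every page would only repeat the start key.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> matches = new ArrayList<>();
        String start = "";
        boolean firstPage = true;
        while (true) {
            List<String> page = store.rangeSlice(start, pageSize);
            // Pages after the first begin with the key already seen, so skip it.
            int skip = firstPage ? 0 : Math.min(1, page.size());
            for (String key : page.subList(skip, page.size())) {
                if (key.startsWith(prefix)) {
                    matches.add(key);
                }
            }
            if (page.size() < pageSize) {
                break; // a short page means the scan reached the end
            }
            start = page.get(page.size() - 1);
            firstPage = false;
        }
        return matches;
    }
}
```

Something like PrefixScan.keysWithPrefix(store, "CATEGORY.", 100) would return the matching keys in token order. The scan cannot stop early, because matching keys are scattered across the token order, so it costs a pass over the whole column family.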

2010/6/9 Philip Stanhope <pstanhope@wimba.com>

> I feel that there is a significant bit of confusion here.
>
> You CAN use start/finish when using get_range_slices with random
> partitioner. But you can't make any assumptions about what key will be next
> in the range which is the whole point of "random". If you do know a specific
> key that you care about, you can use that as a start, but again, you don't
> know what will come next.
>
> If you have a CF with 1M keys ... you can effectively do a full row scan
> ... it is expensive and you'd have to ask yourself why you'd be wanting to
> do this in the first place.
>
> Ordering of columns for a particular key is completely dependent on the
> CompareWith choice you make when you define the column family. For example,
> you can make assumptions about the sequencing of columns returned from
> get_slice (NOT get_range_slices).
>
> -phil
>
> On Jun 9, 2010, at 7:29 AM, David Boxenhorn wrote:
>
> To use start and finish parameters at all, you need to use OPP. Start and
> finish parameters don't work if you don't use OPP, i.e. the result set won't
> be:  start <= resultSet < finish
>
> 2010/6/9 Ben Browning <ben324@gmail.com>
>
>> OPP stands for Order-Preserving Partitioner. For more information on
>> partitioners, look here:
>>
>> http://wiki.apache.org/cassandra/StorageConfiguration#Partitioner
>>
>> To do key range slices that use both start and finish parameters and
>> retrieve keys in-order, you need to use an ordered partitioner -
>> either the built-in OPP or your own custom one.
>>
>> Ben
>>
>> On Tue, Jun 8, 2010 at 10:26 PM, sina <ywf2008@sina.com> wrote:
>> > What does OPP mean? And how can I make "start" and "finish" useful
>> > and meaningful?
>> >
>> >
>> > 2010-06-09
>> > ________________________________
>> > 9527
>> > ________________________________
>> > From: Ben Browning
>> > Sent: 2010-06-02  21:08:57
>> > To: user
>> > Cc:
>> > Subject: Re: Range search on keys not working?
>> > They exist because when using OPP they are useful and make sense.
>> > On Wed, Jun 2, 2010 at 8:59 AM, David Boxenhorn <david@lookin2.com>
>> > wrote:
>> >> So why do the "start" and "finish" range parameters exist?
>> >>
>> >> On Wed, Jun 2, 2010 at 3:53 PM, Ben Browning <ben324@gmail.com> wrote:
>> >>>
>> >>> Martin,
>> >>>
>> >>> On Wed, Jun 2, 2010 at 8:34 AM, Dr. Martin Grabmüller
>> >>> <Martin.Grabmueller@eleven.de> wrote:
>> >>> > I think you can specify an end key, but it should be a key which
>> >>> > does exist in your column family.
>> >>>
>> >>>
>> >>> Logically, it doesn't make sense to ever specify an end key with
>> >>> random partitioner. If you specified a start key of "aaa" and an end
>> >>> key of "aac" you might get back as results "aaa", "zfc", "hik", etc.
>> >>> And, even if you have a key of "aab" it might not show up. Key ranges
>> >>> only make sense with order-preserving partitioner. The only time to
>> >>> ever use a key range with random partitioner is when you want to
>> >>> iterate over all keys in the CF.
>> >>>
>> >>> Ben
>> >>>
>> >>>
>> >>> > But maybe I'm off the track here and someone else here knows more
>> >>> > about this key range stuff.
>> >>> >
>> >>> > Martin
>> >>> >
>> >>> > ________________________________
>> >>> > From: David Boxenhorn [mailto:david@lookin2.com]
>> >>> > Sent: Wednesday, June 02, 2010 2:30 PM
>> >>> > To: user@cassandra.apache.org
>> >>> > Subject: Re: Range search on keys not working?
>> >>> >
>> >>> > In other words, I should check the values as I iterate, and stop
>> >>> > iterating when I get out of range?
>> >>> >
>> >>> > I'll try that!
>> >>> >
>> >>> > On Wed, Jun 2, 2010 at 3:15 PM, Dr. Martin Grabmüller
>> >>> > <Martin.Grabmueller@eleven.de> wrote:
>> >>> >>
>> >>> >> When not using OPP, you should not use something like 'CATEGORY/' as
>> >>> >> the end key. Use the empty string as the end key and limit the number
>> >>> >> of returned keys, as you did with the 'max' value.
>> >>> >>
>> >>> >> If I understand correctly, the end key is used to generate an end
>> >>> >> token by hashing it, and there is not the same correspondence between
>> >>> >> 'CATEGORY' and 'CATEGORY/' as for hash('CATEGORY') and
>> >>> >> hash('CATEGORY/').
>> >>> >>
>> >>> >> At least, this was the explanation I gave myself when I had the same
>> >>> >> problem.
>> >>> >>
>> >>> >> The solution is to iterate through the keys by always using the last
>> >>> >> key returned as the start key for the next call to get_range_slices,
>> >>> >> and then to drop the first element from the result.
>> >>> >>
>> >>> >> HTH,
>> >>> >>   Martin
>> >>> >>
>> >>> >> ________________________________
>> >>> >> From: David Boxenhorn [mailto:david@lookin2.com]
>> >>> >> Sent: Wednesday, June 02, 2010 2:01 PM
>> >>> >> To: user@cassandra.apache.org
>> >>> >> Subject: Re: Range search on keys not working?
>> >>> >>
>> >>> >> The previous thread where we discussed this is called "key is sorted?"
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Jun 2, 2010 at 2:56 PM, David Boxenhorn <david@lookin2.com>
>> >>> >> wrote:
>> >>> >>>
>> >>> >>> I'm not using OPP. But I was assured on earlier threads (I asked
>> >>> >>> several times to be sure) that it would work as stated below: the
>> >>> >>> results would not be ordered, but they would be correct.
>> >>> >>>
>> >>> >>> On Wed, Jun 2, 2010 at 2:51 PM, Torsten Curdt <tcurdt@vafer.org>
>> >>> >>> wrote:
>> >>> >>>>
>> >>> >>>> Sounds like you are not using an order preserving partitioner?
>> >>> >>>>
>> >>> >>>> On Wed, Jun 2, 2010 at 13:48, David Boxenhorn <david@lookin2.com>
>> >>> >>>> wrote:
>> >>> >>>> > Range search on keys is not working for me. I was assured in
>> >>> >>>> > earlier threads that range search would work, but the results
>> >>> >>>> > would not be ordered.
>> >>> >>>> >
>> >>> >>>> > I'm trying to get all the rows that start with "CATEGORY."
>> >>> >>>> >
>> >>> >>>> > I'm doing:
>> >>> >>>> >
>> >>> >>>> > String start = "CATEGORY.";
>> >>> >>>> > .
>> >>> >>>> > .
>> >>> >>>> > .
>> >>> >>>> > keyspace.getSuperRangeSlice(columnParent, slicePredicate, start,
>> >>> >>>> >     "CATEGORY/", max)
>> >>> >>>> > .
>> >>> >>>> > .
>> >>> >>>> > .
>> >>> >>>> >
>> >>> >>>> > in a loop, setting start to the last key each time - but I'm
>> >>> >>>> > getting rows that don't start with "CATEGORY."!!
>> >>> >>>> >
>> >>> >>>> > How do I get all rows that start with "CATEGORY."?
>> >>> >>>
>> >>> >>
>> >>> >
>> >>> >
>> >>
>> >>
>>
>
>
>
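
Martin's explanation in the quoted thread (the end key is hashed into an end token, and hashing does not keep the relationship between 'CATEGORY.' and 'CATEGORY/') can be illustrated with a small sketch that compares key order with token order. TokenOrderDemo and its methods are hypothetical names, the MD5-based token is only assumed to resemble what the random partitioner computes, and ring wrap-around is not modelled.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Cassandra or any client library.
class TokenOrderDemo {
    // MD5 of the key as a non-negative integer (assumption: similar in
    // spirit to a random-partitioner token).
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(md5.digest(key.getBytes(StandardCharsets.UTF_8))).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Keys in plain string order, which is what a key range intuitively means.
    static List<String> sortedByKey(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Keys in token order, which is the order a token-based scan walks.
    static List<String> sortedByToken(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(Comparator.comparing(TokenOrderDemo::token));
        return copy;
    }

    // True when the key's token lies between the tokens of the two bound
    // keys. Simplification: no wrap-around of the token ring is handled.
    static boolean tokenWithin(String key, String startKey, String endKey) {
        BigInteger t = token(key);
        return token(startKey).compareTo(t) <= 0 && t.compareTo(token(endKey)) <= 0;
    }
}
```

Calling sortedByKey and sortedByToken on the same list will generally give two unrelated orders, and tokenWithin("CATEGORY.a", "CATEGORY.", "CATEGORY/") says nothing about whether the key sorts between the two bounds as a string. That is the reason a start/end pair chosen by key prefix returns unrelated rows without OPP.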
