cassandra-user mailing list archives

From Tuukka Mustonen <tuukka.musto...@gmail.com>
Subject Re: Storing values of mixed types in a list
Date Wed, 25 Jun 2014 11:06:00 GMT
Actually, come to think of it, of course I cannot run greater/less than
queries on list items anyway (would be something like "WHERE items CONTAINS
> 4"), so binary encoding should be fine. Thanks for everybody's input!

Tuukka
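
For readers of the archive, the client-side encoding discussed above might
look like the following minimal sketch. It assumes Python 3 and a one-byte
type tag in front of each value; the tag values and the names encode_item
and decode_item are illustrative assumptions, not part of Cassandra or of
any driver. Because the encoding is deterministic, equal values produce
equal bytes, which is what an exact-match lookup on a list<blob> item
relies on.

```python
import struct

# Hypothetical one-byte type tags (an assumption; any stable scheme works).
TAG_BOOL, TAG_INT, TAG_FLOAT, TAG_TEXT = b"\x00", b"\x01", b"\x02", b"\x03"


def encode_item(value):
    """Serialize a bool, int, float or str into tagged bytes."""
    # bool is tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return TAG_BOOL + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        return TAG_INT + struct.pack(">q", value)  # 64-bit signed, big-endian
    if isinstance(value, float):
        return TAG_FLOAT + struct.pack(">d", value)  # IEEE 754 double
    if isinstance(value, str):
        return TAG_TEXT + value.encode("utf-8")
    raise TypeError("unsupported item type: %r" % type(value))


def decode_item(blob):
    """Inverse of encode_item: turn tagged bytes back into a Python value."""
    blob = bytes(blob)  # also accepts bytearray or memoryview
    tag, payload = blob[:1], blob[1:]
    if tag == TAG_BOOL:
        return payload == b"\x01"
    if tag == TAG_INT:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_FLOAT:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_TEXT:
        return payload.decode("utf-8")
    raise ValueError("unknown type tag: %r" % tag)


items = [4, "foobar", True, 2.5]
encoded = [encode_item(v) for v in items]    # values for a list<blob> column
decoded = [decode_item(b) for b in encoded]  # round trip back to Python values
```

The bytes returned by encode_item(4) would then be supplied as the value to
match in a CONTAINS condition on the indexed list<blob> column. Tagging the
type also keeps True and 1 distinct, which plain Python equality would not.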


On Wed, Jun 25, 2014 at 1:49 PM, Tuukka Mustonen <tuukka.mustonen@gmail.com>
wrote:

> Sorry for the confusion, I should have outlined my requirements better in
> the first place. Let me try to summarize:
>
> - I can use list<blob> and query against it using secondary indexes and by
> encoding my data on the client side. However, *this only allows exact
> matches, not greater/less than* for numbers at least (not sure I need
> to, but maybe). Please correct me if I got it wrong? I'm not very familiar
> with playing with binary.
> - My supported list of types is very limited, indeed, and the order
> doesn't matter, so I could use a separate list for each type. However, that
> makes playing with data somewhat cumbersome and I need to have multiple
> clauses in queries then, for each type.
> - I could use user defined types, but I would still have to define
> a separate field for each value and queries would again be cumbersome.
>
> Let's forget about "dynamic schema" as I'm a Cassandra newbie and
> definitely need to study more before opening that chest of wonders.
> Thanks for correcting me.
>
> I just wish there was an easy way to define a list as list<?> and to run
> queries against it. But it sounds like there isn't (and nobody sees a need
> for it), so I think I'll just take one of the suggested workarounds...
>
> Tuukka
>
>
>
> On Wed, Jun 25, 2014 at 10:47 AM, Sylvain Lebresne <sylvain@datastax.com>
> wrote:
>
>> On Wed, Jun 25, 2014 at 8:49 AM, Tuukka Mustonen <
>> tuukka.mustonen@gmail.com> wrote:
>>
>>> Unfortunately, I need to query per list items. That's why I'm running
>>> Cassandra 2.1rc1 (offers secondary indexes for collections).
>>>
>>
>> Using a list of blobs does not in any way prevent you from doing that.
>> Types are constraints on what values C* will accept and using blob is
>> simply asking C* to not reject any value. Doing so does not in any way
>> limit the kind of queries you can do.
>>
>> The small downside of using blobs is that you'll have to
>> serialize/deserialize your value manually client-side, but that's not a
>> huge deal either. That said, if you really only have 3 types of values to
>> store and if you don't particularly care about the order of items in the
>> collection (i.e. if you said you want a list but could really do with a
>> set), then storing 3 different sets can be a viable solution too (as in,
>> there is no strong downside to doing it as far as C* is concerned and it
>> may be simpler to deal with client side (or not, it depends a bit on what
>> your client side code does exactly)).
>>
>>
>>>
>>> As I understood it, Cassandra also supports dynamic schemas, but only
>>> through the Thrift protocol.
>>>
>>
>> "dynamic schemas" is a terribly imprecise term that means different
>> things to different people, but in general that statement is incorrect: you
>> can do the same things with CQL and with Thrift.
>>
>>
>>> Also, I don't think it changes the fact that collections need to be
>>> strongly-typed in Cassandra, no matter what protocol is used?
>>>
>>
>> Well, yes since you do have to provide a type for the elements in the
>> collection, but as said previously that does not in any way prevent you
>> from having "collections of anything" since you can use a blob type.
>>
>> --
>> Sylvain
>>
>>
>>>
>>> Tuukka
>>>
>>>
>>>
>>> On Tue, Jun 24, 2014 at 9:41 PM, DuyHai Doan <doanduyhai@gmail.com>
>>> wrote:
>>>
>>>> "Jeremy, with blob field (ByteBuffer), I can query exact matches (just
>>>> encode the value in query), but greater/less than queries would not work.
>>>> Any sort of serialization kills "native" ways to query data" --> Not
>>>> necessarily. You still use "normal" types (uuid, string, timestamp,...) for
>>>> clustering columns and use them for querying. For the cells where you store
>>>> values, use blob type.
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jun 24, 2014 at 8:21 PM, Tuukka Mustonen <
>>>> tuukka.mustonen@gmail.com> wrote:
>>>>
>>>>> What if I need to query by list items?
>>>>>
>>>>> 1. Jeremy, with blob field (ByteBuffer), I can query exact matches
>>>>> (just encode the value in query), but greater/less than queries would
>>>>> not work. Any sort of serialization kills "native" ways to query data
>>>>> 2. Even with user defined types, I would need to define separate
>>>>> fields for each value. Running queries would be cumbersome (something
>>>>> like WHERE items CONTAINS {'text_value': 'foobar'} or WHERE items
>>>>> CONTAINS {'int_value': 3}). Pavel, did you mean like this?
>>>>>
>>>>> I'm running 2.1rc1 with python driver 2.0.2.
>>>>>
>>>>> Tuukka
>>>>>
>>>>>
>>>>> On Tue, Jun 24, 2014 at 4:39 PM, Pavel Kogan <pavel.kogan@cortica.com>
>>>>> wrote:
>>>>>
>>>>>> 1) You can use list of strings which are serialized JSONs, or use
>>>>>> ByteBuffer with your own serialization as Jeremy suggested.
>>>>>> 2) Use Cassandra 2.1 (not officially released yet) where there is a
>>>>>> new feature of user defined types.
>>>>>>
>>>>>> Pavel
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 24, 2014 at 9:18 AM, Jeremy Jongsma <jeremy@barchart.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Use a ByteBuffer value type with your own serialization (we use
>>>>>>> protobuf for complex value structures)
>>>>>>>  On Jun 24, 2014 5:30 AM, "Tuukka Mustonen" <
>>>>>>> tuukka.mustonen@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I need to store a list of mixed types in Cassandra. The list may
>>>>>>>> contain numbers, strings and booleans. So I would need something
>>>>>>>> like list<?>.
>>>>>>>>
>>>>>>>> Is this possible in Cassandra and if not, what workaround would you
>>>>>>>> suggest for storing a list of mixed type items? I sketched a few
>>>>>>>> (using a list per type, using a list of user types in Cassandra 2.1,
>>>>>>>> etc.), but I get a bad feeling about each.
>>>>>>>>
>>>>>>>> Couldn't find an "exact" answer to this through searches...
>>>>>>>> Regards,
>>>>>>>> Tuukka
>>>>>>>>
>>>>>>>> P.S. I first asked this at SO before realizing the traffic there is
>>>>>>>> very low:
>>>>>>>> http://stackoverflow.com/questions/24380158/storing-a-list-of-mixed-types-in-cassandra
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
