cassandra-user mailing list archives

From Dorian Hoxha <dorian.ho...@gmail.com>
Subject Re: Maximum number of columns in a table
Date Thu, 15 Sep 2016 20:24:26 GMT
@DuyHai
I know they don't support it.
I need key+value mapping, not just "values" or just "keys".

I'll use the lucene index.



On Thu, Sep 15, 2016 at 10:23 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> I'd advise anyone against using the old native secondary index ... You'll
> get poor performance (that's the main reason why some people developed
> SASI).
>
> On Thu, Sep 15, 2016 at 10:20 PM, Hannu Kröger <hkroger@gmail.com> wrote:
>
>> Hi,
>>
>> The ‘old-fashioned’ secondary indexes do support indexing of collection
>> values:
>> https://docs.datastax.com/en/cql/3.1/cql/ddl/ddlIndexColl.html
>>
>> Br,
>> Hannu
>>
>> On 15 Sep 2016, at 15:59, DuyHai Doan <doanduyhai@gmail.com> wrote:
>>
>> "But the problem is I can't use secondary indexing "where int25=5", while
>> with normal columns I can."
>>
>> You have many objectives that contradict each other in terms of
>> implementation.
>>
>> Right now you're unlucky: SASI does not support indexing collections yet
>> (it may come in the future; when? ¯\_(ツ)_/¯ )
>>
>> If you're using DSE Search or Stratio Lucene Index, you can index map
>> values
>>
>> On Thu, Sep 15, 2016 at 9:53 PM, Dorian Hoxha <dorian.hoxha@gmail.com>
>> wrote:
>>
>>> Yes that makes more sense. But the problem is I can't use secondary
>>> indexing "where int25=5", while with normal columns I can.
>>>
>>> On Thu, Sep 15, 2016 at 8:23 PM, sfescape@gmail.com <sfescape@gmail.com>
>>> wrote:
>>>
>>>> I agree a single blob would also work (I do that in some cases). The
>>>> reason for the map is if you need more flexible updating. I think your
>>>> solution of a map/data type works well.
>>>>
>>>> On Thu, Sep 15, 2016 at 11:10 AM DuyHai Doan <doanduyhai@gmail.com>
>>>> wrote:
>>>>
>>>>> "But I need rows together to work with them (indexing etc)"
>>>>>
>>>>> What do you mean rows together ? You mean that you want to fetch a
>>>>> single row instead of 1 row per property right ?
>>>>>
>>>>> In this case, the map might be the solution:
>>>>>
>>>>> CREATE TABLE generic_with_maps(
>>>>>    object_id uuid,
>>>>>    boolean_map map<text, boolean>,
>>>>>    text_map map<text, text>,
>>>>>    long_map map<text, bigint>,
>>>>>    ...
>>>>>    PRIMARY KEY(object_id)
>>>>> );
>>>>>
>>>>> The trick here is to store all the fields of the object in different
>>>>> maps, depending on the type of the field.
>>>>>
>>>>> The map key is always text and it contains the name of the field.
>>>>>
>>>>> Example
>>>>>
>>>>> {
>>>>>    "id": xxxx,
>>>>>    "name": "John DOE",
>>>>>    "age": 32,
>>>>>    "last_visited_date": "2016-09-10 12:01:03"
>>>>> }
>>>>>
>>>>> INSERT INTO generic_with_maps(object_id, text_map, long_map, date_map)
>>>>> VALUES(xxx, {'name': 'John DOE'}, {'age': 32},
>>>>>        {'last_visited_date': '2016-09-10 12:01:03'});
>>>>>
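A minimal client-side sketch of the split that the INSERT above performs by hand: each field goes into the map that matches its value type. The function name `split_by_type` is hypothetical, the dictionary keys follow the column names in the CREATE TABLE, and `date_map` is an assumed extra column for timestamp values.

```python
from datetime import datetime


def split_by_type(fields):
    """Group an object's fields into one dict per value type.

    The keys of the returned dict are assumed column names; date_map in
    particular is not declared in the table shown in this thread.
    """
    maps = {"boolean_map": {}, "text_map": {}, "long_map": {}, "date_map": {}}
    for name, value in fields.items():
        # bool must be tested before int: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            maps["boolean_map"][name] = value
        elif isinstance(value, int):
            maps["long_map"][name] = value
        elif isinstance(value, datetime):
            maps["date_map"][name] = value
        elif isinstance(value, str):
            maps["text_map"][name] = value
        else:
            raise TypeError(
                f"unsupported type for field {name!r}: {type(value).__name__}"
            )
    return maps


fields = {
    "name": "John DOE",
    "age": 32,
    "last_visited_date": datetime(2016, 9, 10, 12, 1, 3),
}
maps = split_by_type(fields)
```

Each resulting dict could then be bound to the matching map column of a prepared INSERT statement.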
>>>>> When you do a select, you'll get a SINGLE row returned. But then you
>>>>> need to extract all the properties from different maps, not a big deal
>>>>>
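Extracting the properties from the different maps can be sketched as below. This assumes the selected row has already been converted to a plain dict of column name to value, and that an empty map column comes back as `None`; the function name `merge_maps` is hypothetical.

```python
from collections.abc import Mapping


def merge_maps(row):
    """Flatten a row's typed map columns into one dict of field name -> value."""
    merged = {}
    for column, value in row.items():
        if isinstance(value, Mapping):
            # Map column: copy its entries; the column name itself is dropped.
            merged.update(value)
        elif value is not None:
            # Scalar column such as object_id: keep it under its own name.
            merged[column] = value
    return merged


row = {
    "object_id": "xxx",
    "boolean_map": None,
    "text_map": {"name": "John DOE"},
    "long_map": {"age": 32},
}
merged = merge_maps(row)
```

Because a field name lives in exactly one map (the one matching its type), the merge should not produce key collisions as long as no field shares a name with a scalar column.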
>>>>> On Thu, Sep 15, 2016 at 7:54 PM, Dorian Hoxha <dorian.hoxha@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> @DuyHai
>>>>>> Yes, that's another case, the "entity" model used in rdbms. But I
>>>>>> need rows together to work with them (indexing etc).
>>>>>>
>>>>>> @sfescape
>>>>>> The map is needed when you have a dynamic schema. I don't have a
>>>>>> dynamic schema (may have, and will use the map if I do). I just have
>>>>>> thousands of schemas. One user needs 10 integers, while another user
>>>>>> needs 20 booleans, and another needs 30 integers, or a combination of
>>>>>> them all.
>>>>>>
>>>>>> On Thu, Sep 15, 2016 at 7:46 PM, DuyHai Doan <doanduyhai@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> "Another possible alternative is to use a single map column"
>>>>>>>
>>>>>>> --> how do you manage the different types then ? Because maps in
>>>>>>> Cassandra are strongly typed
>>>>>>>
>>>>>>> Unless you set the type of map value to blob, in this case you might
>>>>>>> as well store all the object as a single blob column
>>>>>>>
>>>>>>> On Thu, Sep 15, 2016 at 6:13 PM, sfescape@gmail.com <
>>>>>>> sfescape@gmail.com> wrote:
>>>>>>>
>>>>>>>> Another possible alternative is to use a single map column.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Sep 15, 2016 at 7:19 AM Dorian Hoxha <
>>>>>>>> dorian.hoxha@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Since I will only have 1 table with that many columns, and the
>>>>>>>>> other tables will be "normal" tables with max 30 columns, and the
>>>>>>>>> memory of 2K columns won't be that big, I'm gonna guess I'll be
>>>>>>>>> fine.
>>>>>>>>>
>>>>>>>>> The data model is too dynamic, the alternative would be to create
>>>>>>>>> a table for each user which will have even more overhead since the
>>>>>>>>> number of users is in the several thousands/millions.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 15, 2016 at 3:04 PM, DuyHai Doan <doanduyhai@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> There is no real limit in terms of number of columns in a table, I
>>>>>>>>>> would say that the impact of having a lot of columns is the amount
>>>>>>>>>> of meta data C* needs to keep in memory for encoding/decoding each
>>>>>>>>>> row.
>>>>>>>>>>
>>>>>>>>>> Now, if you have a table with 1000+ columns, the problem is
>>>>>>>>>> probably your data model...
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 15, 2016 at 2:59 PM, Dorian Hoxha <
>>>>>>>>>> dorian.hoxha@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Is there a lot of overhead with having a big number of columns in
>>>>>>>>>>> a table? Not unbounded, but say, would 2000 be a problem (I think
>>>>>>>>>>> that's the maximum I'll need)?
>>>>>>>>>>>
>>>>>>>>>>> Thank You
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>>
>
