cassandra-user mailing list archives

From Peter Lin <wool...@gmail.com>
Subject Re: Dynamic Columns
Date Wed, 21 Jan 2015 01:41:02 GMT
I think that table example misses the point of Chetan's functional
requirement. He actually needs dynamic columns.

On Tue, Jan 20, 2015 at 8:12 PM, Xu Zhongxing <xu_zhong_xing@163.com> wrote:

> Maybe this is the closest thing to "dynamic columns" in CQL 3.
>
> create table review (
>     product_id bigint,
>     created_at timestamp,
>     data_key text,
>     data_tvalue text,
>     data_ivalue int,
>     primary key ((product_id, created_at), data_key)
> );
>
> data_tvalue and data_ivalue are optional.
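>
> A minimal sketch of how such a table could be written and read, assuming
> the table is named review and its partition key is (product_id,
> created_at). The product id, timestamp, and attribute names below are
> made-up sample values. Because data_key is a clustering column, a single
> key or a range of keys can be fetched without reading the whole partition.
>
> ```cql
> -- Each attribute is its own row; set only the value column that fits.
> INSERT INTO review (product_id, created_at, data_key, data_ivalue)
> VALUES (42, '2015-01-20 12:00:00+0000', 'battery_rating', 4);
>
> INSERT INTO review (product_id, created_at, data_key, data_tvalue)
> VALUES (42, '2015-01-20 12:00:00+0000', 'summary', 'Works well');
>
> -- Fetch a single attribute by its key.
> SELECT data_key, data_tvalue, data_ivalue FROM review
> WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000'
>   AND data_key = 'battery_rating';
>
> -- Fetch a slice of attributes whose keys fall in a range.
> SELECT data_key, data_tvalue, data_ivalue FROM review
> WHERE product_id = 42 AND created_at = '2015-01-20 12:00:00+0000'
>   AND data_key >= 'a' AND data_key < 'c';
> ```
>
> Note that both product_id and created_at must be given in full, since
> together they form the partition key in this layout.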
>
> At 2015-01-21 04:44:07, "chetan verma" <chetanverma82@gmail.com> wrote:
>
> Hi,
>
> Adding to my previous mail. For example, we have a column family named
> review (with some arbitrary data in maps).
>
> CREATE TABLE review(
> product_id bigint,
> created_at timestamp,
> data_int map<text, int>,
> data_text map<text, text>,
> PRIMARY KEY (product_id, created_at)
> );
>
> Assume I use these two maps to store arbitrary data (data_int for int
> values and data_text for text values).
> In cassandra-cli, each map entry shows up within the partition as a column
> named <clustering_key>:data_int:map_key, with the map value as the column
> value. Suppose I need to get just this value: I couldn't do that with CQL3,
> but in Thrift it's possible. Any solution?
>
> On Wed, Jan 21, 2015 at 1:06 AM, chetan verma <chetanverma82@gmail.com>
> wrote:
>
>> Hi,
>>
>> Most of the time I will be querying on product_id and created_at, but
>> for analytics I need to query on almost all columns.
>> The multiple-collections idea is good, but the only problem is that
>> Cassandra reads a collection entirely. What if I need a slice of it, I mean
>> columns for certain keys, which is possible with Thrift? Please suggest.
>>
>> On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield <
>> jlacefield@datastax.com> wrote:
>>
>>> Hello,
>>>
>>> There are probably lots of options to this challenge.  The more details
>>> around your use case that you can provide, the easier it will be for this
>>> group to offer advice.
>>>
>>> A few follow-up questions:
>>>   - How will you query this data?
>>>   - Do your queries require filtering on specific columns other than
>>> product_id and created_at, i.e. the dynamic columns?
>>>
>>> Depending on the answers to these questions, you have several options,
>>> of which here are a few:
>>>
>>>    - Cassandra efficiently stores sparse data, so you could create
>>>    columns and not populate them, without much of a penalty
>>>    - Could use a clustering column to store a column's type and another
>>>    col (potentially clustering) to store the value
>>>       - i.e. CREATE TABLE foo (col1 int, attname text, attvalue text,
>>>       col4...n, PRIMARY KEY (col1, attname, attvalue));
>>>       - where attname stores the name of the attribute/column and
>>>       attvalue stores the value of that attribute
>>>       - have seen users use this model and create a "main" attribute
>>>       row within a partition that stores the values associated with col4...n
>>>    - Could store multiple collections
>>>    - Others probably have ideas as well
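>>>
>>> A minimal sketch of the attname/attvalue option above, assuming a table
>>> called foo with only the three columns named in the example (the
>>> col4...n columns are left out). The inserted values are made-up sample
>>> data.
>>>
>>> ```cql
>>> -- One row per attribute; attname and attvalue are clustering columns.
>>> CREATE TABLE foo (
>>>     col1 int,
>>>     attname text,
>>>     attvalue text,
>>>     PRIMARY KEY (col1, attname, attvalue)
>>> );
>>>
>>> INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'color', 'red');
>>> INSERT INTO foo (col1, attname, attvalue) VALUES (1, 'size', 'large');
>>>
>>> -- Read one attribute of a partition by name.
>>> SELECT attname, attvalue FROM foo
>>> WHERE col1 = 1 AND attname = 'color';
>>> ```
>>>
>>> With attvalue in the primary key, an attribute can hold several values
>>> at once; leaving it out of the key would keep one value per attribute.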
>>>
>>> You may want to look in the archives for a similar discussion topic.
>>> Believe this item was asked a few months ago as well.
>>>
>>> Jonathan Lacefield
>>>
>>> Solution Architect | (404) 822 3487 | jlacefield@datastax.com
>>>
>>> On Tue, Jan 20, 2015 at 1:40 PM, chetan verma <chetanverma82@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am creating a review system. For instance, let's assume the following
>>>> are the attributes of the system:
>>>>
>>>> Review{
>>>> id bigint,
>>>> product_id bigint,
>>>> created_at timestamp,
>>>> summary text,
>>>> description text,
>>>> pros set<text>,
>>>> cons set<text>,
>>>> feature_rating map<text, int>
>>>> etc....
>>>> }
>>>> I chose product_id as the partition key (so that all the reviews for a
>>>> given product reside on the same node) and created_at and id (DESC) as
>>>> the clustering key, so that reviews are sorted by time.
>>>>
>>>> I may have more columns, and I want to meet that requirement with dynamic
>>>> columns, but they have the limitations explained above.
>>>> Could you please let me know the best way?
>>>>
>>>> On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield <
>>>> jlacefield@datastax.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>>   Have you looked at solving this challenge with clustering columns?
>>>>> Also, please describe the problem set details for more specific advice
>>>>> from this group.
>>>>>
>>>>>   Starting new projects on Thrift isn't the recommended approach.
>>>>>
>>>>> Jonathan
>>>>>
>>>>>
>>>>> On Tue, Jan 20, 2015 at 1:24 PM, chetan verma <chetanverma82@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am starting a new project with Cassandra as the database.
>>>>>> I have unstructured data, so I need dynamic columns.
>>>>>> In CQL3 we can achieve this via collections, but there are some
>>>>>> downsides:
>>>>>> 1. Collections are meant to store small amounts of data.
>>>>>> 2. The maximum size of an item in a collection is 64K.
>>>>>> 3. Cassandra reads a collection in its entirety.
>>>>>> 4. The number of items in a collection is limited to 64,000.
>>>>>>
>>>>>> There is also no support for getting a single column by map key, which
>>>>>> is possible via cassandra-cli.
>>>>>> Please suggest whether I should use CQL3 or Thrift, and which driver
>>>>>> is best.
>>>>>>
>>>>>> --
>>>>>> *Regards,*
>>>>>> *Chetan Verma*
>>>>>> *+91 99860 86634*
