cassandra-user mailing list archives

From chetan verma <chetanverm...@gmail.com>
Subject Re: Dynamic Columns
Date Tue, 20 Jan 2015 19:36:56 GMT
Hi,

Most of the time I will be querying on product_id and created_at, but for
analytics I need to query on almost all columns.
The multiple-collections idea is good, but the only problem is that Cassandra
reads a collection in its entirety. What if I need a slice of it, I mean the
columns for certain keys, which is possible with Thrift? Please suggest.
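
To make the slice I am asking about concrete, here is a rough sketch of how
it might look with the clustering-column model suggested below instead of a
collection. The table name review_attributes, the "rating:" prefix
convention, and the sample values are assumptions for illustration only, and
attvalue is kept as a regular column here rather than part of the primary
key, which is a variation on the suggestion.

```cql
-- Hypothetical table: one CQL row per dynamic attribute of a review.
CREATE TABLE review_attributes (
    product_id bigint,
    created_at timestamp,
    id         bigint,
    attname    text,   -- name of the dynamic attribute
    attvalue   text,   -- value of that attribute
    PRIMARY KEY (product_id, created_at, id, attname)
) WITH CLUSTERING ORDER BY (created_at DESC, id DESC, attname ASC);

-- Fetch a single attribute of one review by its name.
SELECT attname, attvalue
FROM review_attributes
WHERE product_id = 42
  AND created_at = '2015-01-20 19:36:56+0000'
  AND id = 1001
  AND attname = 'summary';

-- Fetch a slice: every attribute whose name starts with 'rating:'
-- (';' is the character that sorts immediately after ':').
SELECT attname, attvalue
FROM review_attributes
WHERE product_id = 42
  AND created_at = '2015-01-20 19:36:56+0000'
  AND id = 1001
  AND attname >= 'rating:' AND attname < 'rating;';
```

If this is right, only the matching rows would be read rather than the whole
set of attributes, which is the behaviour I was getting from Thrift column
slices.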

On Wed, Jan 21, 2015 at 12:36 AM, Jonathan Lacefield <
jlacefield@datastax.com> wrote:

> Hello,
>
> There are probably lots of options to this challenge.  The more details
> around your use case that you can provide, the easier it will be for this
> group to offer advice.
>
> A few follow-up questions:
>   - How will you query this data?
>   - Do your queries require filtering on specific columns other than
> product_id and created_at, i.e. the dynamic columns?
>
> Depending on the answers to these questions, you have several options, of
> which here are a few:
>
>    - Cassandra efficiently stores sparse data, so you could create
>    columns and not populate them, without much of a penalty
>    - Could use a clustering column to store a column's type and another
>    col (potentially clustering) to store the value
>       - i.e. CREATE TABLE foo (col1 int, attname text, attvalue text,
>       col4...n, PRIMARY KEY (col1, attname, attvalue));
>       - where attname stores the name of the attribute/column and
>       attvalue stores the value of that attribute
>       - have seen users use this model and create a "main" attribute row
>       within a partition that stores the values associated with col4...n
>    - Could store multiple collections
>    - Others probably have ideas as well
>
> You may want to look in the archives for a similar discussion topic.
> Believe this item was asked a few months ago as well.
>
> Jonathan Lacefield
>
> Solution Architect | (404) 822 3487 | jlacefield@datastax.com
>
> On Tue, Jan 20, 2015 at 1:40 PM, chetan verma <chetanverma82@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am creating a review system. For instance, let's assume the following are
>> the attributes of the system:
>>
>> Review{
>> id bigint,
>> product_id bigint,
>> created_at timestamp,
>> summary text,
>> description text,
>> pros set<text>,
>> cons set<text>,
>> feature_rating map<text, int>
>> etc....
>> }
>> I made product_id the partition key (so that all the reviews for a
>> given product reside on the same node)
>> and created_at and id (DESC) the clustering key, so that reviews are
>> sorted by time.
>>
>> I may need more columns, and I want to meet that requirement with dynamic
>> columns, but they have the limitations explained in my first message.
>> Could you please let me know the best way?
>>
>> On Tue, Jan 20, 2015 at 11:59 PM, Jonathan Lacefield <
>> jlacefield@datastax.com> wrote:
>>
>>> Hello,
>>>
>>>   Have you looked at solving this challenge with clustering columns?
>>> Also, please describe the problem set details for more specific advice from
>>> this group.
>>>
>>>   Starting new projects on Thrift isn't the recommended approach.
>>>
>>> Jonathan
>>>
>>> Jonathan Lacefield
>>>
>>> Solution Architect | (404) 822 3487 | jlacefield@datastax.com
>>>
>>> On Tue, Jan 20, 2015 at 1:24 PM, chetan verma <chetanverma82@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am starting a new project with Cassandra as the database.
>>>> I have unstructured data, so I need dynamic columns.
>>>> In CQL3 we can achieve this with collections, but there are some
>>>> downsides:
>>>> 1. Collections are meant to store small amounts of data.
>>>> 2. The maximum size of an item in a collection is 64K.
>>>> 3. Cassandra reads a collection in its entirety.
>>>> 4. The number of items in a collection is restricted to 64,000.
>>>>
>>>> There is also no support for getting a single column by map key, which is
>>>> possible via cassandra-cli.
>>>> Please suggest whether I should use CQL3 or Thrift, and which driver is
>>>> best.
>>>>
>>>> --
>>>> *Regards,*
>>>> *Chetan Verma*
>>>> *+91 99860 86634*
>>>>
>>>
>>>
>>
>>
>> --
>> *Regards,*
>> *Chetan Verma*
>> *+91 99860 86634*
>>
>
>


-- 
*Regards,*
*Chetan Verma*
*+91 99860 86634*
