cassandra-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: keyspace with hundreds of columnfamilies
Date Sun, 13 Jul 2014 12:48:21 GMT
If your 1K tables might grow to 5 or 10K, doesn’t that mean you would be adding
columns later, after you’ve populated your data? If so, that would argue for using
one or more map columns to accommodate the dynamic addition of pseudo-columns.
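
To make the map-column suggestion concrete, here is a minimal Python sketch that only builds CQL statement strings. The table name, the key column `id`, and the map column `attrs` are hypothetical names chosen for illustration and are not from this thread. It contrasts adding a fixed column (a schema change per pseudo-column) with writing a new key into a map (a data change only).

```python
# Table name, key column "id", and map column "attrs" are hypothetical.

def quote_text(value: str) -> str:
    """Render a CQL text literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def add_fixed_column(table: str, column: str) -> str:
    """Schema change: every new pseudo-column needs its own ALTER TABLE."""
    return f"ALTER TABLE {table} ADD {column} text"


def create_map_table(table: str) -> str:
    """One-time schema: a single map column holds all pseudo-columns."""
    return f"CREATE TABLE {table} (id text PRIMARY KEY, attrs map<text, text>)"


def set_map_entry(table: str, row_id: str, name: str, value: str) -> str:
    """Data change only: a new pseudo-column is just a new map key."""
    return (
        f"UPDATE {table} SET attrs[{quote_text(name)}] = {quote_text(value)} "
        f"WHERE id = {quote_text(row_id)}"
    )


if __name__ == "__main__":
    print(create_map_table("feed_settings"))
    print(set_map_entry("feed_settings", "user42", "theme", "dark"))
    print(add_fixed_column("feed_settings", "theme"))
```

Note that all values in one map share a single type (text in this sketch), and that real client code would normally use prepared statements with bound values rather than building strings by hand.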

Once again, look at your queries (as they are today and as they will be in the future as you
expand the data), since they will be your ultimate guide to how to model your data.

And drill deeper into how you will be inserting and updating the data in “groups” –
that will guide the data modeling as well. What will the typical update use cases look like?

By all means, start simple, but also be careful not to paint yourself into a corner. In the
alternative, be prepared to throw away entire implementations as your conceptualization of
the data evolves.

-- Jack Krupansky

From: tommaso barbugli 
Sent: Saturday, July 12, 2014 3:12 PM
To: user@cassandra.apache.org 
Subject: Re: keyspace with hundreds of columnfamilies

hi Jack 
thank you for your clear answer!

On Saturday, 12 July 2014, Jack Krupansky <jack@basetechnology.com> wrote:

  1. What does your data look like – 100 small integers or short strings and dates, or...
100 massive blobs?

it will be only small short strings/varints, no blobs or nested data



  2. What operations are you doing on those rows – reading and updating individual columns,
or mostly full-row upserts?

mostly reads and writes of groups of columns (previously i had those sets of columns in different CFs)


  3. 100 columns in a CQL row is not so unreasonable, per se.

  4. The ultimate answer to any “how will it perform” question is to do a “proof of
concept” implementation since it really all depends on your actual data and hardware setup,
such as memory, cpu, I/O, and networking – IOW, all the non-Cassandra factors can easily
dwarf Cassandra itself.

  5. As far as 1K tables with 10 columns vs. 100 tables with 100 columns – it should primarily
be your queries (and updates) that drive the decision. Do fewer tables and more columns make
your queries (and updates) a lot simpler and cleaner?

yes, code-wise it does; i am just scared that i will get into a bad situation when
1k CFs grow to 5 or 10k


  -- Jack Krupansky

  From: tommaso barbugli 
  Sent: Saturday, July 12, 2014 7:58 AM
  To: user@cassandra.apache.org 
  Subject: Re: keyspace with hundreds of columnfamilies

  hi, 
  how is a table with hundreds of columns going to perform? 

  i am moving from 1k column families each with 10 columns to 100 CFs each with 100 columns.

  thank you
  tommaso

  On Friday, 11 July 2014, Sourabh Agrawal <iitr.sourabh@gmail.com> wrote:

    Yes, what about CQL style columns? Please clarify



    On Sat, Jul 5, 2014 at 12:32 PM, tommaso barbugli <tbarbugli@gmail.com> wrote:

      Yes, my question was about CQL-style columns.



      2014-07-04 12:40 GMT+02:00 Jens Rantil <jens.rantil@tink.se>:



        Just so you guys aren't misunderstanding each other: Tommaso, you were not referring to CQL-style columns, right? 

        /J



        On Fri, Jul 4, 2014 at 10:18 AM, Romain HARDOUIN <romain.hardouin@urssaf.fr> wrote:

          Cassandra can handle many more columns (e.g. time series). 
          So 100 columns is OK. 

          Best, 
          Romain 



          tommaso barbugli <tbarbugli@gmail.com> wrote on 03/07/2014 21:55:18:

          > From: tommaso barbugli <tbarbugli@gmail.com>
          > To: user@cassandra.apache.org
          > Date: 03/07/2014 21:55
          > Subject: Re: keyspace with hundreds of columnfamilies
          >

          > thank you for the replies; I am rethinking the schema design, one 
          > possible solution is to "implode" one dimension and get N times fewer CFs.
          > With this approach I would come up with (cql) tables with up to 100 
          > columns; would that be a problem? 
          > 
          > Thank You, 
          > Tommaso 
          > 







    -- 

    Sourabh Agrawal 
    Bangalore
    +91 9945657973


  -- 
  sent from iphone (sorry for the typos)



-- 
sent from iphone (sorry for the typos)
