cassandra-user mailing list archives

From David McNelis <dmcne...@agentisenergy.com>
Subject Re: Schema Question
Date Tue, 25 Jan 2011 13:28:56 GMT
I'm fairly certain Aaron is referring to named column families like
BlogEntries, not row keys (i-got-a-new-guitar).
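
To make the distinction concrete, here is a minimal sketch in Python. It is
only an in-memory model, not the Cassandra client API: the Keyspace class and
its methods are hypothetical names invented for illustration. It assumes
column families are declared up front (as with storage-conf.xml, mentioned
below), while row keys are created simply by writing to them.

```python
# Hypothetical in-memory model of a keyspace; NOT the Cassandra client API.
class Keyspace:
    def __init__(self, column_families):
        # Column families are fixed when the schema is defined.
        self._cfs = {name: {} for name in column_families}

    def insert(self, cf, row_key, columns):
        if cf not in self._cfs:
            # Writing to an undeclared column family is rejected.
            raise KeyError(f"column family {cf!r} is not defined in the schema")
        # Row keys need no declaration: the row appears on first write.
        self._cfs[cf].setdefault(row_key, {}).update(columns)

    def get(self, cf, row_key):
        # Return a copy of all columns stored under the row key.
        return dict(self._cfs[cf].get(row_key, {}))


ks = Keyspace(["BlogEntries"])

# Creating a row key on the fly is fine.
ks.insert(
    "BlogEntries",
    "i-got-a-new-guitar",
    {"author": "Arin Sarkissian", "tags": "life,guitar,music"},
)
```

In this model, inserting under a new slug always works, whereas inserting into
a column family that was never declared raises an error.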

On Tue, Jan 25, 2011 at 4:37 AM, Andy Burgess
<andy.burgess@rbsworldpay.com> wrote:

>  Aaron,
>
> A question about one of your general points, "do not create CF's on the
> fly" - what, exactly, does this mean? Do you mean named column families,
> like "BlogEntries" from Sam's example, or do you mean column family keys,
> like "i-got-a-new-guitar"? If it's the latter, then could you please explain
> why not to do this? My application is based around creating row keys on the
> fly, so I'd like to know ahead of time if I'm creating potential trouble for
> myself.
>
> To be honest, if you do mean specifically column families and not column
> family keys, then I don't even understand how you would go about creating
> those on-the-fly anyway. Don't they have to be pre-configured in
> storage-conf.xml?
>
> Thanks,
> Andy.
>
>
> On 25/01/11 00:39, Aaron Morton wrote:
>
> Sam,
> The best advice is to jump in and try any schema. If you are just starting
> out, start simple; you're going to re-write it several times. Worry about
> scale later; in most cases it's going to work.
>
>  Some general points:
>
> - do not create CF's on the fly.
> - work out your common read requests and denormalise to support these, the
> writes will be fast enough.
> - try to get each read request to be resolved by reading from a single CF
> (not a rule, just a guideline)
> - avoid big super columns.
> - this may also be interesting
> http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
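
The denormalisation point above can be sketched as follows. This is a minimal
in-memory illustration in Python, not Cassandra client code. The tagged_posts
index column family and the save_post / slugs_for_tag helpers are hypothetical
names assumed for the example; only BlogEntries and its columns come from the
thread.

```python
from collections import defaultdict

# Each column family is modelled as {row_key: {column_name: value}}.
blog_entries = {}
# Assumed index CF: row key = tag, column name = slug, value = publish date.
tagged_posts = defaultdict(dict)


def save_post(slug, title, body, tags, pub_date):
    """Write the post once, then once more per tag (denormalised)."""
    blog_entries[slug] = {
        "title": title,
        "body": body,
        "tags": ",".join(tags),
        "pubDate": str(pub_date),
        "slug": slug,
    }
    for tag in tags:
        tagged_posts[tag][slug] = str(pub_date)


def slugs_for_tag(tag):
    """Answer 'posts with this tag' by reading a single row of a single CF."""
    return sorted(tagged_posts.get(tag, {}))


save_post("i-got-a-new-guitar", "New guitar", "yada yada",
          ["life", "guitar", "music"], 1250558004)
save_post("another-cool-guitar", "Another one", "yada yada",
          ["guitar"], 1250558005)
```

The extra writes are the price paid so that each common read touches one
column family instead of scanning every post.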
>
>  If you are happy with the one in the article, start with that and see how
> it works with your app. See how it works for your read activities.
>
>  Hope that helps.
> Aaron
>
>
> On 25 Jan, 2011, at 12:47 PM, Sam Hodgson <hodgson_sam@hotmail.com> wrote:
>
>   Hi all,
>
> I'm brand new to Cassandra - I'm migrating from MySQL for a large forum site
> and would be grateful if anyone can give me some basic pointers on schema
> design, or any recommended documentation.
>
> The example used in
> http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model is very
> close if not exactly what I need for my main CF:
>
> <!--
>     ColumnFamily: BlogEntries
>     This is where all the blog entries will go:
>
>     Row Key +> post's slug (the seo friendly portion of the uri)
>     Column Name: an attribute for the entry (title, body, etc)
>     Column Value: value of the associated attribute
>
>     Access: grab an entry by slug (always fetch all Columns for Row)
>
>     fyi: tags is a denormalization... its a comma separated list of tags.
>     im not using json in order to not interfere with our
>     notation but obviously you could use anything as long as your app
>     knows how to deal w/ it
>
>     BlogEntries : { // CF
>         i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
>             title: This is a blog entry about my new, awesome guitar,
>             body: this is a cool entry. etc etc yada yada
>             author: Arin Sarkissian  // a row key into the Authors CF
>             tags: life,guitar,music  // comma sep list of tags (basic denormalization)
>             pubDate: 1250558004      // unixtime for publish date
>             slug: i-got-a-new-guitar
>         },
>         // all other entries
>         another-cool-guitar : {
>             ...
>             tags: guitar,
>             slug: another-cool-guitar
>         },
>         scream-is-the-best-movie-ever : {
>             ..
>             tags: movie,horror,
>             slug: scream-is-the-best-movie-ever
>         }
>     }
> -->
> <ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
>
> How well would this scale? Say you are storing 5 million posts and looking
> to scale that up: would it be better to segment them into several column
> families, and if so, to what extent?
>
> I could create column families to store posts for each category; however,
> I'd end up with thousands of CFs. That said, the data would then be stored
> in a very sorted manner for querying/presenting.
>
> My db is very write heavy and growing fast, and Cassandra sounds like the
> best solution. Any advice is greatly appreciated!!
>
> Thanks
>
> Sam
>
>
>
> --
> Andy Burgess
> Principal Development Engineer
> Application Delivery
> WorldPay Ltd.
> 270-289 Science Park, Milton Road
> Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
> Office: +44 (0)1223 706 779 | Mobile: +44 (0)7909 534 940
> andy.burgess@worldpay.com
>


-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
