cassandra-user mailing list archives

From Andy Burgess <andy.burg...@rbsworldpay.com>
Subject Re: Schema Question
Date Tue, 25 Jan 2011 10:37:20 GMT
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    Aaron,<br>
    <br>
    A question about one of your general points, "do not create CF's on
    the fly" - what, exactly, does this mean? Do you mean named column
    families, like "BlogEntries" from Sam's example, or do you mean
    column family keys, like "i-got-a-new-guitar"? If it's the latter,
    then could you please explain why not to do this? My application is
    based around creating row keys on the fly, so I'd like to know ahead
    of time if I'm creating potential trouble for myself.<br>
    <br>
    To be honest, if you do mean specifically column families and not
    column family keys, then I don't even understand how you would go
    about creating those on-the-fly anyway. Don't they have to be
    pre-configured in storage-conf.xml?<br>
    <br>
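    To make the distinction concrete, here is a minimal Python sketch of
    the two things being compared. It is a toy in-memory model, not the
    Cassandra client API: <code>ToyKeyspace</code> and
    <code>UnknownColumnFamily</code> are hypothetical names, and it
    assumes column families are declared up front, as the
    storage-conf.xml question above supposes.<br>
    <br>

```python
class UnknownColumnFamily(KeyError):
    """Raised when a write names a column family that was never declared."""


class ToyKeyspace:
    """In-memory stand-in for a keyspace; NOT the Cassandra client API."""

    def __init__(self, declared_cfs):
        # Assumption: the set of column families is fixed at setup time.
        self._cfs = {name: {} for name in declared_cfs}

    def insert(self, cf, row_key, column, value):
        if cf not in self._cfs:
            raise UnknownColumnFamily(cf)
        # Row keys need no declaration: the first write creates the row.
        self._cfs[cf].setdefault(row_key, {})[column] = value

    def get(self, cf, row_key):
        # Return a copy of the row's columns (empty if the row is absent).
        return dict(self._cfs[cf].get(row_key, {}))

    def row_count(self, cf):
        return len(self._cfs[cf])


ks = ToyKeyspace(["BlogEntries"])
ks.insert("BlogEntries", "i-got-a-new-guitar", "title", "My new guitar")
ks.insert("BlogEntries", "another-cool-guitar", "tags", "guitar")
```

    In this model, row keys such as "i-got-a-new-guitar" come into
    existence on first write, while a write to a column family that was
    never declared is rejected.<br>
    <br>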
    Thanks,<br>
    Andy.<br>
    <br>
    On 25/01/11 00:39, Aaron Morton wrote:
    <blockquote cite="mid:5e63430a-7297-61dd-889c-af54ac95a62a@me.com"
      type="cite">
      <div>Sam,&nbsp;</div>
      <div>The best advice is to jump in and try any schema. If you are
        just starting out, start simple; you're going to rewrite it
        several times. Worry about scale later; in most cases it's going
        to work.&nbsp;</div>
      <div><br>
      </div>
      <div>Some general points:</div>
      <div><br>
      </div>
      <div>- do not create CF's on the fly.&nbsp;</div>
      <div>- work out your common read requests and denormalise to
        support these; the writes will be fast enough.&nbsp;</div>
      <div>- try to get each read request to be resolved by reading from
        a single CF (not a rule, just a guideline)</div>
      <div>- avoid big super columns.&nbsp;</div>
      <div>- this may also be interesting&nbsp;<a moz-do-not-send="true"
href="http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/"
_mce_href="http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/">http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/</a></div>
      <div><br>
      </div>
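      <div>As a rough illustration of the denormalisation point, here is
        a minimal Python sketch using plain dictionaries in place of
        column families. It is not the Cassandra client API; the
        <code>tagged_posts</code> index, the helper names and the second
        sample post are assumptions made for the example, and publish
        times are assumed to be unique within a tag.</div>
      <div><br>
      </div>

```python
from collections import defaultdict

# Toy in-memory column families: {row_key: {column_name: value}}.
blog_entries = defaultdict(dict)
tagged_posts = defaultdict(dict)  # hypothetical index CF, one row per tag


def save_post(slug, title, body, tags, pub_date):
    """Write the entry once, then once more per tag (denormalised)."""
    blog_entries[slug].update(
        title=title, body=body, tags=",".join(tags), pubDate=pub_date, slug=slug
    )
    for tag in tags:
        # Column name is the publish time, so a row can be read in date order.
        # Assumption: no two posts under one tag share a publish time.
        tagged_posts[tag][pub_date] = slug


def slugs_for_tag(tag, limit=10):
    """Answer 'newest posts for a tag' from the index CF alone."""
    row = tagged_posts.get(tag, {})
    newest_first = sorted(row, reverse=True)[:limit]
    return [row[ts] for ts in newest_first]


# Sample data: the first post mirrors the quoted example, the second is made up.
save_post("i-got-a-new-guitar", "New guitar", "body text",
          ["life", "guitar", "music"], 1250558004)
save_post("another-cool-guitar", "Another guitar", "body text",
          ["guitar"], 1250558999)
```

      <div>Each write costs one extra insert per tag, and in exchange
        the "posts for a tag" read is served from a single row of a
        single CF.</div>
      <div><br>
      </div>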
      <div>If you are happy with the one in the article, start with that
        and see how it works with your app. See how it works for your
        read activities.&nbsp;</div>
      <div><br>
      </div>
      <div>Hope that helps.&nbsp;</div>
      <div>Aaron</div>
      <div><br>
      </div>
      <div><br>
        On 25 Jan, 2011, at 12:47 PM, Sam Hodgson
        <a class="moz-txt-link-rfc2396E" href="mailto:hodgson_sam@hotmail.com">&lt;hodgson_sam@hotmail.com&gt;</a>
wrote:<br>
        <br>
      </div>
      <div>
        <blockquote type="cite">
          <div class="msg-quote">
            Hi all,
            <br>
            <br>
            I'm brand new to Cassandra. I'm migrating from MySQL for a
            large forum site and would be grateful if anyone can give me
            some basic pointers on schema design, or any recommended
            documentation.&nbsp; <br>
            <br>
            The example used in
            <a class="moz-txt-link-freetext" href="http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model">http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model</a>
            is very close, if not exactly, what I need for my main CF:<br>
            <pre><code>&lt;!--
    ColumnFamily: BlogEntries
    This is where all the blog entries will go:

    Row Key =&gt; post's slug (the seo friendly portion of the uri)
    Column Name: an attribute for the entry (title, body, etc)
    Column Value: value of the associated attribute

    Access: grab an entry by slug (always fetch all Columns for Row)

    fyi: tags is a denormalization... its a comma separated list of tags.
    im not using json in order to not interfere with our
    notation but obviously you could use anything as long as your app
    knows how to deal w/ it

    BlogEntries : { // CF
        i-got-a-new-guitar : { // row key - the unique "slug" of the entry.
            title: This is a blog entry about my new, awesome guitar,
            body: this is a cool entry. etc etc yada yada
            author: Arin Sarkissian  // a row key into the Authors CF
            tags: life,guitar,music  // comma sep list of tags (basic denormalization)
            pubDate: 1250558004      // unixtime for publish date
            slug: i-got-a-new-guitar
        },
        // all other entries
        another-cool-guitar : {
            ...
            tags: guitar,
            slug: another-cool-guitar
        },
        scream-is-the-best-movie-ever : {
            ..
            tags: movie,horror,
            slug: scream-is-the-best-movie-ever
        }
    }
--&gt;
&lt;ColumnFamily CompareWith="BytesType" Name="BlogEntries"/&gt;

</code></pre>
            How well would this scale? Say you are storing 5 million posts and
            looking to scale that up: would it be better to segment them into
            several column families, and if so, to what extent?<br>
            <br>
            I could create a column family to store the posts for each category;
            however, I'd end up with thousands of CFs. That said, the data would
            then be stored in a very sorted manner for querying and presenting.<br>
            <br>
            My db is very write-heavy and growing fast, so Cassandra sounds like
            the best solution. Any advice is greatly appreciated!<br>
            <br>
            Thanks<br>
            <br>
            Sam<br>
            <br>
            <style id="_message-styles" type="text/css" _mce_bogus="1"><!--
.msg-quote .hmmessage p {margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left:
0px; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; }
div.msg-quote .hmmessage {font-size: 10pt; font-family: Tahoma; }
--></style></div>
        </blockquote>
      </div>
    </blockquote>
    <br>
    <pre class="moz-signature" cols="72">-- 
Andy Burgess
Principal Development Engineer
Application Delivery
WorldPay Ltd.
270-289 Science Park, Milton Road
Cambridge, CB4 0WE, United Kingdom (Depot Code: 024)
Office: +44 (0)1223 706 779| Mobile: +44 (0)7909 534 940
<a class="moz-txt-link-abbreviated" href="mailto:andy.burgess@worldpay.com">andy.burgess@worldpay.com</a>
</pre>
  </body>
</html>
