cassandra-user mailing list archives

From Michael Koziarski <>
Subject Re: Visual representation of Cassandra data model
Date Thu, 13 Aug 2009 05:21:18 GMT
On Thu, Aug 13, 2009 at 5:12 PM, Arin Sarkissian<> wrote:
> FWIW: I find that the only sane way to visually represent a data model
> is to use a JSON-ish notation.
> Picture-type visualizations confuse me even more.
> I don't mean to be a downer, but a lot of my peers and I found all the
> picture-type visual aids even more confusing.

I agree, it's generally easier and pretty much everyone understands
jsonish notation (though I find ruby's => notation for hashes is
easier to follow ;))
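
For anyone following along, here is a rough sketch of what such a
jsonish layout could look like, written as nested Python dicts. The
column family names, row keys, values and timestamps are all made up
for illustration and are not taken from Mark's diagram; the two helper
functions are hypothetical and are not part of any Cassandra API.

```python
# Hypothetical layout, for illustration only:
# keyspace -> column family -> row key -> column name -> (value, timestamp)
keyspace = {
    # One column family per kind of data, each stored and queried separately.
    "Users": {
        "user1": {"name": ("Example User", 1250140001)},
    },
    "Bookmarks": {
        "bookmark1": {"url": ("http://example.com/", 1250140002)},
    },
    # Super column family: row key -> super column -> column -> (value, timestamp).
    # Here the super column is a tag and its columns are bookmark ids.
    "UserTags": {
        "user1": {"cassandra": {"bookmark1": ("", 1250140003)}},
    },
}


def get_column(ks, cf, key, column):
    """Return the value of one column in a standard column family."""
    value, _timestamp = ks[cf][key][column]
    return value


def shared_keys(ks, cf_a, cf_b):
    """Return the row keys that appear in both column families."""
    return set(ks[cf_a]) & set(ks[cf_b])


print(get_column(keyspace, "Users", "user1", "name"))
print(shared_keys(keyspace, "Users", "Bookmarks"))
```

Every lookup names its column family first, and timestamps sit next to
the values because the client supplies them. In this sketch Users and
Bookmarks have no row keys in common, while Users and UserTags happen
to share one.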

Having said that, evan's pictures were really useful:

> -arin
> aka: phatduckk
> On Wed, Aug 12, 2009 at 8:35 PM, Jonathan Ellis<> wrote:
>> Thanks for taking a stab at this, Mark.
>> I'm not a fan of teaching this by showing CF-spanning rows.  (The
>> bigtable paper does this IIRC but it's wrong. :)
>> You can have data in different CFs with the same key, yes, but all
>> that means is they will be stored on the same nodes.  Each CF is
>> stored separately on disk and queried separately and the common case
>> is that they _won't_ have keys in common, rather than the reverse.
>> -Jonathan
>> On Wed, Aug 12, 2009 at 10:24 PM, Mark McBride<> wrote:
>>> Is this clearer?  I had the key names set up as <type>:<id> just to
>>> keep it simple and put everything in one keyspace.  Ditto the super
>>> column, although I guess that could be spread out into three things,
>>> or you could spread it out into three keyspaces.  Not sure what best
>>> practices there are.
>>> What I'd like to do (and I'll get started on this tonight) is start
>>> with a problem statement, and then go about building up a
>>> storage-conf.xml file with this structure, showing API examples along
>>> the way.  So while this is a final picture, there would be simpler
>>> ones up front.
>>>   ---Mark
>>> On Wed, Aug 12, 2009 at 5:35 PM, Ryan King<> wrote:
>>>> A few quick comments:
>>>> * it's not clear what column family the super column you're using is in.
>>>> * it might be useful to include the timestamps in the columns (since
>>>> they're user-supplied)
>>>> * given that the colon-delimited api has been removed, it might be
>>>> easier to explain the data model without such strings
>>>> * why would you mix different kinds of data in the same column family,
>>>> rather than having separate column families for each? (users,
>>>> bookmarks, tags)
>>>> -ryan
>>>> On Wed, Aug 12, 2009 at 4:57 PM, Mark McBride<> wrote:
>>>>> While working on an updated data model wiki page I'm trying to put
>>>>> together a graphical representation of the data model.  I threw this
>>>>> together based on Curt's goal of modeling delicious.  The basic gist
>>>>> is descriptive data for tags, users, and bookmarks goes in the
>>>>> Description column family.  The relationships between bookmarks, tags
>>>>> and users go in the map supercolumn.  I'm not sure this is how you
>>>>> would do it in production (I'm guessing at the very least you'd want
>>>>> separate supercolumns for bookmarks, tags and users), but it seems to
>>>>> be simple enough for a new user to digest, and covers all the bases of
>>>>> the data model (aside from ordering I guess).  So two questions
>>>>> 1) did I get it right (I'm new to this as well)?
>>>>> 2) is this a useful representation?
>>>>>  ---Mark


