incubator-cassandra-dev mailing list archives

From Evan Weaver <ewea...@gmail.com>
Subject Re: Fixing the data model names
Date Thu, 13 Aug 2009 01:34:29 GMT
Points taken, and I agree, except in my experience the current names
are not Pretty Good but rather Pretty Weird; the primary issues being
column family and super column.

If we go by the shorter-is-better principle, we might get:

Cluster
Schema
Row set
Row w/key
Field set
Field

"You take the user's key, and use that to insert into the Row Set
'user_associations' at Field Set 'user_timeline,' a field named with a
time-based UUID representing now, and with a value of the new tweet's
key."
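
As a rough illustration of that sentence, the nesting can be sketched
with plain Python dictionaries. This is not Cassandra's API; the
helper names (new_row_set, insert), the keys "user-42" and
"tweet-123", and the in-memory layout are all assumptions made up for
the example.

```python
import uuid
from collections import defaultdict


def new_row_set():
    """Hypothetical in-memory Row Set: row key -> field set -> field -> value."""
    return defaultdict(lambda: defaultdict(dict))


def insert(row_set, row_key, field_set, field_name, value):
    """Store `value` under the given row key, field set, and field name."""
    row_set[row_key][field_set][field_name] = value


# The Row Set 'user_associations' from the sentence above.
user_associations = new_row_set()

user_key = "user-42"       # assumed user key, for illustration only
tweet_key = "tweet-123"    # assumed key of the new tweet
field_name = uuid.uuid1()  # time-based (version 1) UUID representing now

insert(user_associations, user_key, "user_timeline", field_name, tweet_key)
```

Reading the tweet key back is then
user_associations[user_key]["user_timeline"][field_name].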

But let me study for a while and come up with a more researched proposal.

Evan

On Wed, Aug 12, 2009 at 9:21 PM, Jonathan Ellis<jbellis@gmail.com> wrote:
> On Wed, Aug 12, 2009 at 7:52 PM, Michael Koziarski<michael@koziarski.com> wrote:
>> However I think it's worth considering this from a strategic
>> perspective, looking at how we want the project to grow and change,
>> rather than just as it is right now.  The key to successful adoption
>> is having a successful elevator pitch: you can start using a database
>> without understanding relational algebra because 'table' and 'column'
>> are such simple ways to reason about the tool.  As it stands,
>> Cassandra takes a whiteboard and 15 minutes before people get what
>> you're talking about.
>
> If you want to explain it as "sort of like a relational db" then
>
> table -> CF
> column -> column
> key -> key
> row -> row
>
> That's the simple case, then all you have is "supercolumns can contain
> a list of simple columns."
>
> That really doesn't seem so hard to me.  I have explained this to *managers*.
>
>> Assuming the project gets anything like the adoption it deserves, the
>> users we have today will be a *tiny minority* of the users we have in
>> the future.  So imposing costs on the current userbase which will give
>> huge benefits to future users, should be something we're willing to
>> do.  In fact it's something that has been done repeatedly over the
>> last few weeks.
>
> I agree.  But as I said before I just don't see this as being an improvement.
>
>> Given those changes went in without debate, I'm not sure what the
>> reluctance is for making changes to the nomenclature for the project.
>
> As above.
>
>> Speaking as someone who's only been doing this a month, the naming is
>> *still* confusing, and when I talk with people who wonder what
>> Cassandra is all about I get blank looks when telling them what things
>> are called.  If you step back and want to tell someone how you'd
>> insert a tweet into someone's timeline using Evan's weblog post:
>>
>>  "You just take the user's key, and use that to insert into the
>> SuperColumnFamily 'UserAssociations' at SubColumn 'user_timeline', a
>> ColumnName of a time based uuid representing now, and a value of the
>> new tweet's key"
>>
>> Column is in the name of 3 of the 5 concepts expressed, and in each
>> case it's different.
>
> When you're inserting something nested 3 levels deep a certain amount
> of verbosity is unavoidable.  With Evan's nomenclature,
>
> "You take the user's record ID, and use that to insert into the Record
> Collection 'user associations' at Attribute Collection
> 'user_timeline,' an Attribute named with a time based uuid
> representing now, and with a value of the new tweet's key."
>
> I think that is a negative improvement.  Yay, now we are talking about
> Attribute Collections and Attributes instead of SuperColumns and
> Columns.  The same objections ("one object's name contains the
> other's!") apply, plus the new one of sounding so generic that it could
> apply to practically any system.
>
> -Jonathan
>



-- 
Evan Weaver
