incubator-cassandra-dev mailing list archives

From Bill de hOra <b...@dehora.net>
Subject Re: Fixing the data model names
Date Tue, 11 Aug 2009 21:48:25 GMT


Evan Weaver wrote:
> So technically this is not a bikeshed, because I'm happy to do all the
> work. I'll even submit a patch for Digg's Python client. Since there
> are no production deployments of ASF, and only a couple
> well-maintained clients, now is the time to break the world. A few
> hours of work now will pay off richly in terms of community
> involvement and reduced noob-explanation-time.

Post-keyspace, we have this situation:

1: objects with table in their name:

   http://www.flickr.com/photos/dehora/3812812718/sizes/l/

2: objects with keyspace in their name:

   http://www.flickr.com/photos/dehora/3812812498/sizes/l/

What I take from this is that either the code is dissonant with the
current consensus or the current consensus is a hallucination :)

So if we go through with this, the community needs to commit to renaming
objects and clearing out dead concepts. IME, patch-based processes resist
this kind of high-level rework unless the community, and especially the
reviewers, are up for it.


> In short:
> 
>   Cluster
>   Database
>   Record collection
>   Record
>   Attribute collection
>   Attribute
> 
> We could call the cluster "database collection", but even I'm not
> going to go that far. I realize that each level is merely a collection
> of the collections under it, but an "attribute collection collection
> collection collection" is no help to day-to-day usage. ;-)

"Cluster" has a lot of meaning in the Java world already (a collection 
of app servers) and is tied to the physical model - all the others are 
tied to the logical model of the data.

Putting "Database" underneath "Cluster" misses the point that the
database is distributed across the cluster. Even if it's not right
for Cassandra, "BigTable" captures this concept well. For me, that the
database remains the uppermost concept even after physical distribution
is largely the point of Cassandra.
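
A rough sketch of that distinction, assuming the names from the quoted
list and nothing from the actual codebase. Every class, method, and the
placement rule here is hypothetical, written only to show the logical
hierarchy kept apart from the physical one:

```python
from dataclasses import dataclass, field

# Logical model, using the names proposed in the quoted list (illustrative only).
AttributeCollection = dict[str, bytes]   # attribute name -> value
Record = dict[str, AttributeCollection]  # attribute collection name -> attributes
RecordCollection = dict[str, Record]     # record key -> record


@dataclass
class Database:
    """Uppermost logical concept; it holds no reference to nodes."""
    name: str
    record_collections: dict[str, RecordCollection] = field(default_factory=dict)

    def attribute(self, collection: str, key: str, group: str, name: str) -> bytes:
        # Walk the logical path: record collection -> record -> attribute collection -> attribute.
        return self.record_collections[collection][key][group][name]


@dataclass
class Cluster:
    """Physical model: the nodes a database is spread across."""
    nodes: list[str]

    def node_for(self, key: str) -> str:
        # Hypothetical placement rule, a stand-in and not Cassandra's partitioner.
        return self.nodes[sum(key.encode()) % len(self.nodes)]


db = Database("app")
db.record_collections["users"] = {"evan": {"profile": {"email": b"evan@example.org"}}}
cluster = Cluster(["node-a", "node-b", "node-c"])
```

Here the caller reaches an attribute through the Database alone; the
Cluster only answers where a key would live, so it sits beside the
logical model rather than above it.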


 > A few
 > hours of work now will pay off richly in terms of community
 > involvement and reduced noob-explanation-time.


For usage and the API, there are other concepts that need to be 
articulated properly for Cassandra users, such as slice, reverse, range, 
mutation, consistency, path, parent. I'd like to believe these matter 
in the domain and are not fallout from using thrift/rpc ;)

Bill
