incubator-cassandra-dev mailing list archives

From Alexander Staubo <madevilgen...@gmail.com>
Subject Re: proposal: rename <table> to <namespace>
Date Mon, 22 Jun 2009 16:32:31 GMT
On Mon, Jun 22, 2009 at 4:43 PM, Jonathan Ellis<jbellis@gmail.com> wrote:
> On Mon, Jun 22, 2009 at 9:35 AM, Matt Revelle<mrevelle@gmail.com> wrote:
>> Cassandra only supports one table per instance (before today?).  However, as
>> Jonathan mentioned previously, what you consider to be tables are
>> represented as column families in Cassandra.
>
> See, this is why I suggested renaming <table> -- _everyone_ who hasn't
> read the Bigtable paper (which is increasingly approaching 100% of new
> users) makes this mistake.

I have read the Bigtable paper, and I'm not making any mistake. The
paper consistently refers to its structures as "tables". In fact, it
occasionally uses the term "bigtable" to refer to such tables; quoth
page one: "A Bigtable is a sparse, distributed, persistent
multidimensional sorted map."

> Namespace just implies "keys in one namespace/table won't conflict
> with keys in another" which is exactly right.  (And maybe it's a
> slight abuse of the term to add options like key ordering there but
> the more important Big Picture connotation is correct.)

It's precisely this abuse that confirms that the name "namespace" is inappropriate.

To expand on my earlier argument, here are a few other things that
I consider to be part of a table/namespace/whatchamacallit:

* Column families. Each table has different, differently named families.
* Replication factor. I should be able to define different replication
factors for different tables.
* Key ordering. Ditto.
* Data locations. In a server environment I might want table A to be
on /dev/sda, but table B on /dev/sdb for performance reasons.
* Storage mechanisms. Maybe one day Cassandra will support pluggable storages.
* Indexing and query semantics. At the moment Cassandra has limited
native indexing for performing SQL-like queries, but future support is
probably not out of the question?
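
To make the list above concrete, here is a rough Python sketch of what
per-table settings might look like. Every name in it (TableSettings, the
field names, the paths) is hypothetical and made up for illustration; none
of it is Cassandra's actual configuration or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableSettings:
    """Hypothetical per-table settings; not Cassandra's real configuration."""
    name: str
    # Each table has its own, differently named column families.
    column_families: List[str] = field(default_factory=list)
    # Replication factor can differ from table to table.
    replication_factor: int = 1
    # Key ordering, e.g. "random" or "ordered" (illustrative values only).
    key_ordering: str = "random"
    # Where this table's data lives on disk (illustrative path).
    data_directory: str = "/data/default"
    # Placeholder for a pluggable storage mechanism.
    storage_engine: str = "default"


# Two tables with different settings, living side by side.
table_a = TableSettings(
    name="A",
    column_families=["users", "sessions"],
    replication_factor=3,
    key_ordering="ordered",
    data_directory="/mnt/sda/A",
)
table_b = TableSettings(
    name="B",
    column_families=["logs"],
    data_directory="/mnt/sdb/B",
)
```

The point is only that each of these is a property of the table as a whole,
and none of them is a property of a set of names.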

All of these features are aspects of a kind of logical-physical
separation that deserves a better name than "namespace". I like
"table" because it fits; Cassandra is, after all, a sort of abstract
multidimensional hash table. The fact that Cassandra isn't *just* a
"table", but in fact a table of tables (of tables, potentially) does
not alter this fact.

A namespace is a space of names. It's a space where names live.
Nothing other than names lives there. The sentence, "then we store the
key in the namespace" makes no sense. The thing we're talking about is
not primarily about names but about storing information. In that sense
you could call it a database, a container, a repository -- or a table,
since that's pretty much what it is. It's a table which incidentally
provides a namespace.
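
As a rough illustration of the "table of tables" idea, here is a
simplified in-memory model in Python using nested dictionaries. The
nesting shown (table, then row key, then column family, then column) is
my own simplification, and the names store and lookup are hypothetical;
this is not how Cassandra is implemented.

```python
# Hypothetical model: table -> row key -> column family -> column -> value.
store = {
    "A": {
        "user:1": {
            "profile": {"name": "Alice", "email": "alice@example.com"},
        },
    },
    "B": {
        # Same row key as in table A; the two do not conflict.
        "user:1": {
            "stats": {"logins": "42"},
        },
    },
}


def lookup(store, table, key, family, column):
    """Walk the nested maps down to a single stored value."""
    return store[table][key][family][column]


name = lookup(store, "A", "user:1", "profile", "name")
logins = lookup(store, "B", "user:1", "stats", "logins")
```

The key "user:1" exists in both tables without conflict, which is the
namespace property, but what each table holds is the data itself.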

A.
