cassandra-dev mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Per-Namespace / Per-Table Partitioner
Date Mon, 30 Mar 2009 19:26:04 GMT
The problem is you're using table to mean something that does not fit
into Cassandra.

Cassandra has ColumnFamilies but we do not need tables to group those
the way relational databases group columns.

To the degree that we need locality groups we have Super CFs.

Cassandra already has (the beginnings of) something called a Table
that is really what you are asking for in the first part, a namespace.

Probably we should rename that to Schema or Namespace to avoid confusion.

Have you read the BigTable paper yet?  A lot of it does not apply to
Cassandra but the ColumnFamily/Memtable/SSTable descriptions are worth
reading.  Sections 4, 5.3, 5.4 in particular.  From what I can tell
Cassandra's implementation was heavily inspired by this.

-Jonathan

On Mon, Mar 30, 2009 at 1:16 PM, Neophytos Demetriou
<neophytos@gmail.com> wrote:
> I am not sure yet if this is a good idea or not but we might want to
> consider partitions on a per-namespace basis, e.g. prefixing row keys
> with predefined namespaces. I understand that this might not be the
> preferred course of action for those of you using one cassandra cluster
> per application (SLAs, etc) but nothing prevents you from running a
> partitioner per cluster under this scheme.
>
> Here's how this could work:
>
> In storage-conf.xml:
>
> <partitioner ns="mynsp0">RandomPartitioner</partitioner>
> <partitioner ns="mynsp1">OrderPreservingPartitioner</partitioner>
>
> The row key would then be of the form "mynamespace:mykey" (namespace
> would be optional). If no namespace is provided then the default
> partitioner is used. In this way we could support ranges for those keys
> whose namespace's partitioner supports them. Of course, namespaces
> could also be used for other purposes.
>
> I have discussed this with a friend who encouraged me to post here. I
> hope I'm not way off with this one.
>
> An alternative is to get multiple tables working and implement per-table
> partitioner support. I think there is a conceptual distinction between
> the two in the sense that a table would most likely be a grouping of
> column families whereas namespaces could span different tables and, on
> some occasions or in some applications, one might want to use the same
> column families with different namespaces.
>
> Any feedback is greatly appreciated.
>
> - Neophytos
>
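
For readers following the thread, the namespace-prefix scheme proposed in the quoted message could be sketched roughly as below. This is only an illustration and not Cassandra code: the `PARTITIONERS` mapping, the `DEFAULT_PARTITIONER` choice, and the helpers `split_key`, `partitioner_for`, and `token` are all hypothetical names. Treating an unregistered prefix as part of the key, and using an MD5 hash for the random case, are assumptions made for the example.

```python
import hashlib

# Hypothetical registry mirroring the proposed <partitioner ns="..."> entries.
PARTITIONERS = {
    "mynsp0": "RandomPartitioner",
    "mynsp1": "OrderPreservingPartitioner",
}
# Assumption: the cluster-wide default used when a key has no namespace.
DEFAULT_PARTITIONER = "RandomPartitioner"


def split_key(row_key):
    """Split 'namespace:key' into (namespace, key).

    The namespace is None when the key has no prefix or the prefix is
    not a registered namespace (an assumption of this sketch).
    """
    ns, sep, rest = row_key.partition(":")
    if sep and ns in PARTITIONERS:
        return ns, rest
    return None, row_key


def partitioner_for(row_key):
    """Return the partitioner name that would handle this row key."""
    ns, _ = split_key(row_key)
    return PARTITIONERS.get(ns, DEFAULT_PARTITIONER)


def token(row_key):
    """Compute an illustrative token for the key.

    Order-preserving: the key itself, so key order equals token order.
    Random: an integer derived from a hash of the key.
    """
    if partitioner_for(row_key) == "OrderPreservingPartitioner":
        return row_key
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)
```

Under these assumptions, `partitioner_for("mynsp1:user42")` selects the order-preserving partitioner, so ranges over `mynsp1:` keys stay meaningful, while a bare key such as `"plainkey"` falls back to the default.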
