incubator-cassandra-dev mailing list archives

From Sandeep Tata <sandeep.t...@gmail.com>
Subject Re: Per-Namespace / Per-Table Partitioner
Date Wed, 01 Apr 2009 01:58:30 GMT
I agree with Alexander.

Per-namespace partitioners, while useful for some apps, really end
up looking like a quick and dirty hack for multiple tables.
You could achieve everything Neophytos described in his example by
putting the logic in the partitioner class, if we eventually allowed
users to plug in a more complex partitioning class using:

<Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>

(See CASSANDRA-3)

This is not an elegant solution, but I'm only making it quicker and dirtier :)
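
To make the idea concrete, here is a rough sketch of what "sticking the
logic in the partitioner class" could look like. Everything in it is an
assumption for illustration: the KeyPartitioner interface, the class
names, and the "table:rowkey" key convention are hypothetical and are
not Cassandra's actual partitioner API. The point is only that a single
configured class could dispatch to a different strategy per table.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface for this sketch; not Cassandra's real API.
interface KeyPartitioner {
    BigInteger token(String key);
}

// Hash-based placement: the token is the MD5 digest of the key,
// which spreads keys across the ring regardless of their order.
class HashingPartitioner implements KeyPartitioner {
    public BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}

// Simplified ordered placement: the token is the raw key bytes read as
// an unsigned integer. This preserves order only among keys of equal
// length; a real implementation would compare the bytes directly.
class OrderedPartitioner implements KeyPartitioner {
    public BigInteger token(String key) {
        return new BigInteger(1, key.getBytes(StandardCharsets.UTF_8));
    }
}

// The single "more complex" class: it inspects the assumed
// "table:rowkey" prefix and delegates to the partitioner registered
// for that table, falling back to a default for everything else.
class PerTablePartitioner implements KeyPartitioner {
    private final Map<String, KeyPartitioner> byTable = new HashMap<>();
    private final KeyPartitioner fallback;

    PerTablePartitioner(KeyPartitioner fallback) {
        this.fallback = fallback;
    }

    PerTablePartitioner register(String table, KeyPartitioner partitioner) {
        byTable.put(table, partitioner);
        return this;
    }

    public BigInteger token(String key) {
        int sep = key.indexOf(':');
        if (sep < 0) {
            return fallback.token(key); // no table prefix at all
        }
        KeyPartitioner chosen = byTable.get(key.substring(0, sep));
        if (chosen == null) {
            return fallback.token(key); // unknown table: hash the whole key
        }
        return chosen.token(key.substring(sep + 1));
    }
}
```

With this, something like
new PerTablePartitioner(new HashingPartitioner()).register("logs", new OrderedPartitioner())
would keep the "logs" table ordered while everything else stays hashed.
Note that all tables still share one token space, which is part of why
this remains a hack rather than real per-table support.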

Perhaps we should postpone this discussion until after we resolve CASSANDRA-3?


On Tue, Mar 31, 2009 at 5:02 PM, Alexander Staubo
<madevilgenius@gmail.com> wrote:
> On Mon, Mar 30, 2009 at 10:24 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
>> But I do think there is nothing wrong with partitioner-per-namespace.
>> It should be straightforward to implement (once we have real namespace
>> support to begin with) and it might be interesting for some apps to
>> have that ability.
>
> I can think of plenty of reasons why you would want or need to go
> beyond mere namespaces. In my opinion "table" or possibly "database"
> are the only sensible terms to describe such a division.
>
> For example, tables ought to support different replication factors
> (BigTable and HBase both support this). You might also want to specify
> different database directories for each table, e.g. to distribute them
> across several disks.
>
> There are all sorts of settings you will want to apply differently to
> different tables due to usage semantics; for example, I imagine
> Cassandra could be improved to more efficiently support streaming
> of large blobs of binary data, GFS-style; and that some of that
> support may be enabled by table-level settings (e.g., flags to set
> streaming buffers, append semantics or whatever). I also imagine the
> partitioning and compaction algorithms could mature into providing
> user-definable settings that could be tweaked according to load
> requirements.
>
> It should also be possible to easily delete an entire table without
> touching other tables. For testing purposes, for example, I would like
> to be able to load an entire table into the system, play with it, then
> drop the entire thing, without having to go through the process of a
> whole new, separate Cassandra. Using temporary tables to store result
> sets is also very common in MapReduce applications.
>
> Alexander.
>
