incubator-cassandra-dev mailing list archives

From Jonathan Ellis <>
Subject Re: Per-Namespace / Per-Table Partitioner
Date Wed, 01 Apr 2009 02:20:26 GMT
Yes, I had loadable Partitioners implemented but it is out now pending
Avinash's new OPHF...

On Tue, Mar 31, 2009 at 8:58 PM, Sandeep Tata <> wrote:
> I agree with Alexander.
> The partitioner per-namespace, while useful for some apps, really ends
> up looking like a quick and dirty hack for multiple tables.
> You could achieve all of what Neophytos described in his example by
> sticking the logic in the partitioner class if we eventually allowed
> users to stick a more complex partitioning class using:
> <Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>
> This is not an elegant solution, but I'm only making it quicker and dirtier :)
> Perhaps we should postpone this discussion to after we resolve CASSANDRA-3 ?
> On Tue, Mar 31, 2009 at 5:02 PM, Alexander Staubo
> <> wrote:
>> On Mon, Mar 30, 2009 at 10:24 PM, Jonathan Ellis <> wrote:
>>> But I do think there is nothing wrong with partitioner-per-namespace.
>>> It should be straightforward to implement (once we have real namespace
>>> support to begin with) and it might be interesting for some apps to
>>> have that ability.
>> I can think of plenty of reasons why you would want or need to go
>> beyond mere namespaces. In my opinion "table" or possibly "database"
>> are the only sensible terms to describe such a division.
>> For example, tables ought to support different replication factors
>> (BigTable and HBase both support this). You might also want to specify
>> different database directories for each table, e.g., to distribute them
>> across several disks.
>> There are all sorts of settings you will want to apply differently to
>> different tables due to usage semantics; for example, I imagine
>> Cassandra could be improved to more efficiently support streaming
>> of large blobs of binary data, GFS-style; and that some of that
>> support may be enabled by table-level settings (e.g., flags to set
>> streaming buffers, append semantics or whatever). I also imagine the
>> partitioning and compaction algorithms could mature into providing
>> user-definable settings that could be tweaked according to load
>> requirements.
>> It should also be possible to easily delete an entire table without
>> touching other tables. For testing purposes, for example, I would like
>> to be able to load an entire table into the system, play with it, then
>> drop the entire thing, without having to go through the process of a
>> whole new, separate Cassandra. Using temporary tables to store result
>> sets is also very common in MapReduce applications.
>> Alexander.
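
For readers following the thread, Sandeep's suggestion of putting the per-table logic inside a single configured partitioner class could be sketched roughly as below. This is only an illustration under stated assumptions: the `KeyPartitioner` interface, the three class names, and the `table:rowKey` key convention are all hypothetical and are not Cassandra's actual partitioner API.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface for this sketch; NOT Cassandra's real partitioner API.
interface KeyPartitioner {
    BigInteger token(String key);
}

// Hash-based placement: spreads keys evenly but discards key order.
class HashingPartitioner implements KeyPartitioner {
    public BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // interpret digest as unsigned
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}

// Order-preserving placement: token order follows the byte order of the
// first WIDTH bytes of the key (keys that differ only later will tie).
class OrderedPartitioner implements KeyPartitioner {
    private static final int WIDTH = 16;

    public BigInteger token(String key) {
        byte[] raw = key.getBytes(StandardCharsets.UTF_8);
        byte[] padded = new byte[WIDTH]; // zero-filled, so short keys sort first
        System.arraycopy(raw, 0, padded, 0, Math.min(raw.length, WIDTH));
        return new BigInteger(1, padded);
    }
}

// The "quick and dirty" approach: one partitioner that inspects an assumed
// "table:rowKey" prefix and delegates to a per-table strategy.
class PerTablePartitioner implements KeyPartitioner {
    private final Map<String, KeyPartitioner> byTable = new HashMap<>();
    private final KeyPartitioner fallback;

    PerTablePartitioner(KeyPartitioner fallback) {
        this.fallback = fallback;
    }

    PerTablePartitioner register(String table, KeyPartitioner partitioner) {
        byTable.put(table, partitioner);
        return this;
    }

    public BigInteger token(String key) {
        int sep = key.indexOf(':');
        if (sep < 0) {
            return fallback.token(key); // no table prefix: use the default
        }
        String table = key.substring(0, sep);
        String rowKey = key.substring(sep + 1);
        return byTable.getOrDefault(table, fallback).token(rowKey);
    }
}
```

With this arrangement a hypothetical "logs" table could be registered with `OrderedPartitioner` while every other table falls back to hashing. The drawback discussed in the thread still applies: the table name has to be smuggled into the key, which is why proper per-table settings would be the cleaner design.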
