cassandra-user mailing list archives

From Eric Evans <eev...@rackspace.com>
Subject Re: Is this sentence slightly inaccurate
Date Wed, 07 Apr 2010 23:00:08 GMT
On Wed, 2010-04-07 at 15:13 -0700, Paul Prescod wrote:
> "With OrderPreservingPartitioner the keys themselves are used to place
> on the ring. One of the potential drawbacks of this approach is that
> if rows are inserted with sequential keys, all the write load will go
> to the same node."

Yeah, this isn't very good IMO (incomplete and misleading at best).

> http://wiki.apache.org/cassandra/StorageConfiguration
> 
> Wouldn't the "insertion point" tend to be replicated on more than one
> node in most configurations? Does every "insertion point" exist on a
> single "primary" machine or are writes load-balanced to
> ReplicationFactor nodes? I presume that writes can fail-over, so I
> cannot see why they could not be load balanced.

Keys are routed to a node based on the key and the partitioner used.
Replica placement is based on this location (node). In the simple
case (rack-unaware strategy), the replicas are simply the next N-1
nodes on the ring, where N is the ReplicationFactor.
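
To make the routing concrete, here is a minimal sketch of the idea in
Python. It is not Cassandra's code: the ring layouts, node names, and
the helpers md5_token and replicas are all hypothetical, and the
hash-based ring only approximates what RandomPartitioner does. It
assumes each node owns the tokens up to and including its own, and
that replicas are the next N-1 nodes clockwise, as in the
rack-unaware case.

```python
import bisect
import hashlib


def md5_token(key: str) -> int:
    """Hash a key to an integer token (illustrative, MD5-based)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def replicas(ring, token, n):
    """Return the node that owns `token` plus the next n-1 nodes on the ring.

    `ring` is a list of (token, node_name) pairs sorted by token.
    A node owns every token up to and including its own; tokens past
    the last node wrap around to the first.
    """
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical order-preserving ring: the key itself is the token.
OPP_RING = [("g", "A"), ("n", "B"), ("t", "C"), ("z", "D")]

# Hypothetical hash ring: four evenly spaced tokens over the MD5 range.
HASH_RING = [(i * 2**128 // 4, name) for i, name in enumerate("ABCD")]

if __name__ == "__main__":
    for i in range(1, 6):
        key = f"user{i:04d}"
        by_key = replicas(OPP_RING, key, 3)
        by_hash = replicas(HASH_RING, md5_token(key), 3)
        print(key, "order-preserving:", by_key, "hashed:", by_hash)
```

In this sketch every sequential key such as "user0001" falls in the
range owned by node D on the order-preserving ring, so its replica
list is always D, A, B: the writes land on the same N nodes rather
than on a single node. With hashed tokens the same keys scatter across
the ring.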


> Also: Dominic Williams says that one of the advantages of the
> OrderPreservingPartitioner is: "3. If you screw up, you can scan over
> your data to recover/delete orphaned keys"
> 
> Does anyone know off the top of their head what he might have meant by
> that? 

Using key enumeration when you no longer know what your keys are? I
dunno. I imagine that whatever he was referring to, it's no longer worth
mentioning in the context of the OrderPreservingPartitioner since you
can also enumerate keys with RandomPartitioner (that wasn't always the
case).

-- 
Eric Evans
eevans@rackspace.com

