incubator-cassandra-dev mailing list archives

From Jonathan Ellis
Subject Re: OPHF
Date Thu, 02 Apr 2009 15:48:06 GMT
On Thu, Apr 2, 2009 at 9:04 AM, Avinash Lakshman wrote:
> So how are you coming up with the tokens here?

For Token<String> the key's tokens (that is, the value we compare
against the node tokens to determine partitioning) are just the key
itself, wrapped in the Token interface.

Node token generation gets moved into the IPartitioner interface, and
the partitioner is loaded with Class.forName, so users can easily
define a token generator that meets their needs.  (For instance, for my
app I will probably need to hand-specify tokens in a config file at
first, until I can work on load balancing.)  But there is a random
string generator that will work out of the box for testing.
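
To make the loading step concrete, here is a minimal sketch, not the
actual patch.  The Partitioner interface below is a simplified,
hypothetical stand-in for IPartitioner, and generateNodeToken,
RandomStringPartitioner, and load are illustrative names that do not
come from this thread.  Only Class.forName itself is taken from the
description above.

```java
import java.util.Random;

public class PartitionerLoading {
    /** Hypothetical, simplified stand-in for the IPartitioner interface. */
    public interface Partitioner {
        String generateNodeToken();
    }

    /** Hypothetical random-string token generator, intended for testing only. */
    public static class RandomStringPartitioner implements Partitioner {
        private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";
        private final Random random = new Random();

        @Override
        public String generateNodeToken() {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 16; i++) {
                sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            return sb.toString();
        }
    }

    /** Instantiate a partitioner from its fully qualified class name. */
    public static Partitioner load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            // Requires an accessible no-argument constructor.
            return (Partitioner) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load partitioner " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real deployment the class name would come from configuration.
        Partitioner p = load(RandomStringPartitioner.class.getName());
        System.out.println(p.generateNodeToken());
    }
}
```

A user-defined generator (for example one that reads hand-specified
tokens from a config file) would only need to implement the same
interface and be named in the configuration.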

> What do you mean by string[0]
> and string[last]? Are they the keys that belong to the system?

I meant the tokens assigned to the nodes.  So, the "next" node after
the last node is the first node.  It doesn't matter where the
wraparound point of the key domain is (for strings there usually
won't be one); that's okay, because the important concept is not
the wrap point but which node is "next" for replication etc.

That is, the algorithm for getStorageEndPoints is unchanged -- get a
sorted list of token/node mappings, and binary search on the key's
token.  It doesn't matter if the tokens are BigIntegers or Strings.
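
A rough sketch of that lookup, under stated assumptions: the class
TokenRing and the methods primaryIndex and endpoints are hypothetical
names, and the ownership rule used here (a key belongs to the first
node whose token is greater than or equal to the key's token) is an
assumption for illustration, not something confirmed in this thread.
The point is only that the search is generic over the token type and
wraps from the last node to the first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TokenRing {
    /**
     * Index of the first node token >= keyToken, or 0 (wraparound)
     * when keyToken sorts after every node token.
     */
    public static <T> int primaryIndex(List<T> sortedNodeTokens, T keyToken,
                                       Comparator<? super T> cmp) {
        int i = Collections.binarySearch(sortedNodeTokens, keyToken, cmp);
        if (i < 0) {
            i = -(i + 1); // binarySearch encodes the insertion point as -(point) - 1
            if (i == sortedNodeTokens.size()) {
                i = 0;    // past the last node: the "next" node is the first one
            }
        }
        return i;
    }

    /** The primary node token followed by the next (replicas - 1) tokens on the ring. */
    public static <T> List<T> endpoints(List<T> sortedNodeTokens, T keyToken,
                                        Comparator<? super T> cmp, int replicas) {
        int size = sortedNodeTokens.size();
        int count = Math.min(replicas, size);
        int start = primaryIndex(sortedNodeTokens, keyToken, cmp);
        List<T> result = new ArrayList<>(count);
        for (int k = 0; k < count; k++) {
            result.add(sortedNodeTokens.get((start + k) % size));
        }
        return result;
    }
}
```

The same code works for BigInteger tokens with
Comparator.naturalOrder(), or for String tokens with whatever ordering
the partitioner defines.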

> In
> this scheme does it mean you will need to sort the tokens too using
> collation scheme used by the external partitioner while identifying which
> key goes to which node.

Right.  I add a compareTo method (and reverseCompareTo for
convenience) to the IPartitioner interface and that needs to be used
wherever we were comparing or sorting.
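
As an illustration only: the signatures of compareTo and
reverseCompareTo are not given in this thread, so the two-argument
String form below, the OrderedPartitioner interface, and the
CaseInsensitivePartitioner class are all assumptions.  In particular,
treating reverseCompareTo as the same comparison with its arguments
swapped is a guess at its meaning.  The sketch shows how sorting can
defer to the partitioner's collation instead of the natural String
order.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionerOrdering {
    /** Hypothetical shape; the real IPartitioner signatures may differ. */
    public interface OrderedPartitioner {
        int compareTo(String a, String b);

        /** Assumed meaning: the same ordering, reversed. */
        default int reverseCompareTo(String a, String b) {
            return compareTo(b, a);
        }
    }

    /** Example collation that ignores case, unlike String's natural order. */
    public static class CaseInsensitivePartitioner implements OrderedPartitioner {
        @Override
        public int compareTo(String a, String b) {
            return String.CASE_INSENSITIVE_ORDER.compare(a, b);
        }
    }

    /** Sort a copy of the tokens using the partitioner's ordering. */
    public static List<String> sortTokens(List<String> tokens, OrderedPartitioner p) {
        List<String> sorted = new ArrayList<>(tokens);
        sorted.sort(p::compareTo);
        return sorted;
    }
}
```

Any code that sorts tokens or binary searches over them would need to
use the same ordering, otherwise the sorted list and the lookup would
disagree about which node owns a key.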

> Also could you provide me with the patch number. I
> need to go over that this weekend and make sure if it does not affect the
> (bootstrap/load balance) logic.

The old ones from #3 won't apply any more because of conflicts but I
will rebase them against the current trunk.  (When do you expect to
commit again?  I can wait for that to make sure it will be a clean
apply if that is in the near future.)

> My fear is that these changes have long
> reaching effects and I need to make sure that all these pieces will continue
> to reside in harmony. Also your range query test did it run in a distributed
> environment or in a single box environment?

We're testing on a five-node cluster.

