incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: OPHF vs. Random
Date Thu, 12 Mar 2009 00:48:05 GMT
Use Random for now.  The OPHF is the same as the old one, i.e., not
actually OP. :)

I'm pretty convinced at this point that it's impossible to have an
order-preserving hash that doesn't either (a) impose a relatively
short key length past which no partitioning is done (i.e., all keys
with the same prefix go to the same node) or (b) depend so heavily on
key length that the keys of a given length N are not evenly
distributed across all nodes. Or both.
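
To make (a) and (b) concrete, here is a minimal Python sketch of two
naive key -> integer mappings. It is only an illustration: the names
prefix_hash and whole_key_hash, the 4-byte width, and the ASCII keys
are assumptions, and neither function is the OPHF in the codebase.

```python
def prefix_hash(key: str, width: int = 4) -> int:
    """Case (a): only the first `width` bytes of the key are used.

    Order is preserved (weakly) for ASCII keys, but every key that
    shares the same `width`-byte prefix maps to the same value and
    therefore to the same node.
    """
    data = key.encode("ascii")[:width].ljust(width, b"\x00")
    return int.from_bytes(data, "big")


def whole_key_hash(key: str) -> int:
    """Case (b): the whole key is read as one big-endian integer.

    Every key of length N lands in [0, 256**N), so the value is
    dominated by key length rather than by key content. Order is only
    preserved among keys of equal length.
    """
    return int.from_bytes(key.encode("ascii"), "big")


# (a) keys with a common 4-byte prefix collapse onto one value
same_prefix = prefix_hash("user:alice") == prefix_hash("user:bob")

# (b) all 3-byte keys sit below 256**3, a tiny slice of a 2**64 range
short_keys_clustered = whole_key_hash("zzz") < 256 ** 3
```

In the first mapping nothing after the prefix influences placement;
in the second, all keys of one length crowd into a narrow band of the
numeric range, so they cannot spread evenly over the nodes.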

So I am working on migrating from pluggable hash functions key ->
BigInteger, to pluggable partitioning algorithms key -> EndPoint.
Without the requirement to transform to a numeric value first I think
I can create an order-preserving distribution that performs well.  (I
need this for range queries.)
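
A rough Python sketch of what a direct key -> endpoint mapping could
look like, with no numeric value in between. Everything here is
hypothetical (the class name, the boundary keys, the node names); it
is not the code in CASSANDRA-3, just one way to compare raw keys
against sorted boundary keys.

```python
import bisect


class OrderPreservingPartitioner:
    """Hypothetical partitioner: each node owns keys up to its boundary."""

    def __init__(self, tokens):
        # tokens: dict mapping a boundary key to an endpoint name
        self._boundaries = sorted(tokens)
        self._endpoints = [tokens[b] for b in self._boundaries]

    def endpoint_for(self, key):
        # The first boundary >= key owns the key; keys past the last
        # boundary wrap around to the first node.
        i = bisect.bisect_left(self._boundaries, key)
        return self._endpoints[i % len(self._endpoints)]

    def endpoints_for_range(self, start, end):
        # Nodes holding keys in [start, end], assuming start <= end.
        lo = bisect.bisect_left(self._boundaries, start)
        hi = bisect.bisect_left(self._boundaries, end)
        n = len(self._endpoints)
        result = []
        for i in range(lo, hi + 1):
            endpoint = self._endpoints[i % n]
            if endpoint not in result:  # avoid repeats after wrap-around
                result.append(endpoint)
        return result


partitioner = OrderPreservingPartitioner(
    {"g": "node-a", "n": "node-b", "t": "node-c"}
)
owner = partitioner.endpoint_for("kiwi")
scan_nodes = partitioner.endpoints_for_range("h", "p")
```

Because adjacent keys stay on the same or neighboring nodes, a range
query only has to visit a contiguous run of endpoints. How evenly the
load spreads then depends on how the boundary keys are chosen, not on
a fixed numeric transform of the key.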

So far I have just laid the foundation, here:
https://issues.apache.org/jira/browse/CASSANDRA-3

I hope to finish the rest tomorrow.

-Jonathan

On Wed, Mar 11, 2009 at 5:28 PM, Jiansheng Huang <jiansheng.wi@gmail.com> wrote:
>
> Which one is better to use? The default is Random.
>
> In Avinash's announcement mail, we have
> (1) Ability to switch between a random hash and a OPHF. We still have the
> old (wrong) OPHF in there. I will update it to the corrected one tomorrow.
>
> Is the correct OPHF in? Thanks.
>
