jackrabbit-oak-dev mailing list archives

From Marcel Reutegger <mreut...@adobe.com>
Subject RE: Oak Scalability: Load Distribution
Date Mon, 03 Mar 2014 08:20:07 GMT

> >The path depth is prepended to
> >the path to ensure that the nodes are distributed more equally.
> Actually, the reason for the prefix is not that the nodes are distributed
> more equally, but so that queries for child nodes are efficient, and so
> that siblings are stored next to each other. Queries for child nodes are
> range queries of the form "id between '2:/content/' and '2:/content0'".
> This is efficient because MongoDB keeps documents sorted by id. For more
> details about range queries, see
> http://docs.mongodb.org/manual/core/index-single/

With Joel's approach, this would still be the case: all siblings have the same prefix.
I think it's an interesting idea, because the content structures he mentions are
quite common. For example, think of how user information is structured: it is
all stored at the same depth and will likely be located on the same shard. An
alternative distribution is certainly desirable when this kind of content is
modified often.
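
To make the two key formats concrete, here is a rough sketch in Python. The
function names (depth_key, child_range, hashed_parent_key) and the use of MD5
truncated to 8 hex characters are assumptions made up for illustration. They
are neither Oak's actual code nor necessarily what Joel has in mind. The sketch
also assumes ASCII paths, so that Python's string ordering matches a plain
byte-wise ordering of ids.

```python
import hashlib


def depth(path):
    # "/" has depth 0, "/content" has depth 1, "/content/a" has depth 2.
    return 0 if path == "/" else path.count("/")


def depth_key(path):
    # Current format discussed in this thread: "<depth>:<path>".
    return f"{depth(path)}:{path}"


def child_range(parent):
    # Bounds for a child-node range query on depth-prefixed ids.
    # "0" is the character right after "/" in ASCII, so every direct
    # child id falls strictly between the two bounds.
    d = depth(parent) + 1
    base = "" if parent == "/" else parent
    return f"{d}:{base}/", f"{d}:{base}0"


def parent_of(path):
    # Parent path; the root is treated as its own parent in this sketch.
    head = path.rsplit("/", 1)[0]
    return head or "/"


def hashed_parent_key(path):
    # Hypothetical alternative: prefix with a hash of the parent path.
    # Siblings share a parent, so they still share the same prefix.
    digest = hashlib.md5(parent_of(path).encode("utf-8")).hexdigest()[:8]
    return f"{digest}:{path}"


if __name__ == "__main__":
    lower, upper = child_range("/content")
    print(lower, upper)
    print(lower < depth_key("/content/a") < upper)
    print(hashed_parent_key("/home/users/a"))
    print(hashed_parent_key("/home/users/b"))
```

With the depth prefix, every node under /home/users/* gets an id starting with
"3:/home/users/", next to all other depth-3 nodes. With a hashed parent prefix,
siblings still sort together, but different parents at the same depth would
typically end up with unrelated prefixes.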

> >A much better distribution could be achieved if the hash/checksum of the
> >parent node path would be used instead of the path depth.
> Sure, we can do some experiments and try it out.

Agreed, I think we should give it a try and compare how the two key formats perform.

