jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: Oak Scalability: Load Distribution
Date Mon, 03 Mar 2014 10:41:09 GMT
Hi,

>>> Agreed, I think we should give it a try and compare how the two key
>>> formats perform.
>>
>> Yes. This is specially interesting as it is a simple solution for
>> OAK-333 (1000 byte path limit in MongoMK). Even if we find out
>> performance is much worse than with the current solution, we could
>> still use the hash approach for long paths.
>
> hmm, isn't that a different problem? so far we only discussed replacing
> the depth prefix with a hash prefix. AFAIU the full path would still be
> part of the key.

Sorry, my mistake. I wouldn't do that.

What I would do is: take the parent of the path, calculate its SHA-256
hash, maybe convert that to Base64, and replace the parent part of the
key with the result. So that:

/test/node1 -> swbZ+4R8Eg3X6wD86+XxGLmHWwnbZ62PCDSup6DYy4w/node1
/test/node2 -> swbZ+4R8Eg3X6wD86+XxGLmHWwnbZ62PCDSup6DYy4w/node2
/test2/node1 -> vcSCrX3V8DBPLZUNs6yowJeYbRUyU8wbLqDtg+oefHc/node1
/test2/node2 -> vcSCrX3V8DBPLZUNs6yowJeYbRUyU8wbLqDtg+oefHc/node2


That way, we have the same load distribution as with just replacing the
depth prefix, and at the same time we solve OAK-333 (except for cases
where the node name itself is extremely large).
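
A minimal sketch of the mapping, in Java. The class and method names
(HashedParentKey, toKey, hash) are made up for illustration and are not
existing Oak code. It assumes the parent path is hashed as UTF-8 and
encoded with the standard Base64 alphabet without padding, which gives a
43 character prefix like in the examples above. The depth prefix of the
current key format is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class HashedParentKey {

    // Hypothetical helper: replaces the parent path with the unpadded
    // Base64 form of its SHA-256 hash and keeps the node name as is.
    static String toKey(String path) {
        if (path.equals("/")) {
            return path; // the root node has no parent to hash
        }
        int slash = path.lastIndexOf('/');
        // the parent of "/node" is the root "/"
        String parent = slash == 0 ? "/" : path.substring(0, slash);
        String name = path.substring(slash + 1);
        return hash(parent) + "/" + name;
    }

    // SHA-256 of the UTF-8 bytes, Base64 encoded without '=' padding.
    static String hash(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // siblings share the same prefix, other parents get a different one
        System.out.println(toKey("/test/node1"));
        System.out.println(toKey("/test/node2"));
        System.out.println(toKey("/test2/node1"));
    }
}
```

The standard Base64 alphabet contains '/', so the node name has to be
found by splitting at the last '/' of the key, not the first.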


I would only do that if the path of the parent is longer than 700
characters. That way, a node name has a limit of about 260 bytes (which I
_think_ should be a reasonable limit for node names - but I might be
wrong).

If we have that implementation, we can run some tests using a
threshold of 1 character, so that every node except for the root node is
stored like above.
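
A rough sketch of the threshold switch, in Java. The names
(ThresholdKey, toKey, hash) are hypothetical and not existing Oak code.
One assumption to point out: the comparison here is inclusive (parent
length >= threshold), so that a threshold of 1 also covers the direct
children of the root. With a strict "longer than" comparison a threshold
of 0 would be needed for that. Paths below the threshold are returned
unchanged, without the depth prefix of the current key format.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class ThresholdKey {

    // Hypothetical helper: hashes the parent path only if it has at
    // least 'threshold' characters, otherwise keeps the path unchanged.
    static String toKey(String path, int threshold) {
        if (path.equals("/")) {
            return path; // the root node is never hashed
        }
        int slash = path.lastIndexOf('/');
        String parent = slash == 0 ? "/" : path.substring(0, slash);
        if (parent.length() < threshold) {
            return path; // short parent: keep the readable path
        }
        String name = path.substring(slash + 1);
        return hash(parent) + "/" + name;
    }

    // SHA-256 of the UTF-8 bytes, Base64 encoded without '=' padding.
    static String hash(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().withoutPadding().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toKey("/test/node1", 700)); // parent is short
        System.out.println(toKey("/test/node1", 1));   // test setting
    }
}
```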

Regards,
Thomas

