jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: content hash of a tree
Date Tue, 15 Oct 2013 07:13:27 GMT

>I thought this was one of the fundamental principles that
>allow for quick diff/merge and help identify a commit within the MVCC

I'm not sure where you heard or read that, but a content hash is not
needed for this. A revision number that changes whenever the node, or any
direct or indirect child node, is modified is enough. For the MongoMK, the
revision is a combination of the timestamp, the cluster node id, and a
counter, very similar to the MongoDB object id:
http://docs.mongodb.org/manual/reference/object-id/
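
To illustrate the idea, here is a minimal sketch of such a revision id in Python. It is not the MongoMK code: the class names, the field order, and the string format are assumptions made for this example. It only shows how a timestamp, a per-millisecond counter, and a cluster node id can be combined into an id that is unique and ordered without looking at the content.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Revision:
    """Hypothetical revision id; fields are compared in declaration order."""
    timestamp_ms: int
    counter: int
    cluster_id: int

    def __str__(self):
        # Assumed text format for this sketch: hex fields joined by dashes.
        return f"r{self.timestamp_ms:x}-{self.counter:x}-{self.cluster_id:x}"


class RevisionGenerator:
    """Creates increasing revisions for one cluster node."""

    def __init__(self, cluster_id, clock=lambda: int(time.time() * 1000)):
        self._cluster_id = cluster_id
        self._clock = clock
        self._last_ts = 0
        self._counter = 0

    def new_revision(self):
        now = self._clock()
        if now <= self._last_ts:
            # Same millisecond (or clock went backwards): bump the counter
            # and keep the last timestamp so that ids stay increasing.
            self._counter += 1
        else:
            self._last_ts = now
            self._counter = 0
        return Revision(self._last_ts, self._counter, self._cluster_id)


# Usage: two revisions created in the same millisecond on cluster node 1.
gen = RevisionGenerator(cluster_id=1, clock=lambda: 1000)
r1 = gen.new_revision()
r2 = gen.new_revision()
```

Creating such an id needs no knowledge of the node content, so a writer can stamp a change without reading or hashing anything.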

>what was the reason to abandon this idea?

As for MongoDB: performance and scalability. A node lookup by content
hash would be bad for performance, as it would require an index on
randomly distributed data. See also
http://fr.slideshare.net/daumdna/mongodb-scaling-write-performance - page
9 (the red line is with an index on the content hash, the green line
without). But even without such an index, maintaining the content hash
would be prohibitively expensive and would prevent scalable writes.
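
A rough sketch of why maintaining a content hash is costly, assuming a generic Merkle-style tree where a node's hash covers its own value and the hashes of its children. This is not Oak or MongoMK code; the Node class and the helper functions are hypothetical. The point is that changing a single leaf forces every ancestor up to the root to be rehashed and rewritten.

```python
import hashlib


class Node:
    """Hypothetical tree node carrying a content hash."""

    def __init__(self, name, value=""):
        self.name = name
        self.value = value
        self.children = {}
        self.parent = None
        self.hash = None

    def add(self, child):
        child.parent = self
        self.children[child.name] = child
        return child


def compute_hash(node):
    # The hash covers the node's own data plus all child hashes.
    h = hashlib.sha256()
    h.update(node.name.encode())
    h.update(b"\0")
    h.update(node.value.encode())
    for name in sorted(node.children):
        h.update(node.children[name].hash)
    return h.digest()


def rehash_all(node):
    # Post-order: children first, so their hashes exist for the parent.
    for child in node.children.values():
        rehash_all(child)
    node.hash = compute_hash(node)


def set_value(node, value):
    """Change one node; return how many nodes had to be rehashed."""
    node.value = value
    rehashed = 0
    while node is not None:
        node.hash = compute_hash(node)
        rehashed += 1
        node = node.parent
    return rehashed


# Usage: a chain root/a/b/c plus an unrelated sibling root/x.
root = Node("root")
a = root.add(Node("a"))
b = a.add(Node("b"))
c = b.add(Node("c", "old"))
x = root.add(Node("x", "other"))
rehash_all(root)

old_root_hash = root.hash
old_x_hash = x.hash
touched = set_value(c, "new")
```

In this sketch every write ends by rewriting the root, so writers in unrelated subtrees would still all update the same node, which is one way to see why this does not scale.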

