jackrabbit-oak-dev mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: MongoMK Revision.toString() usage
Date Thu, 16 May 2013 14:39:37 GMT
Hi,

As for MongoDB, there are quite a few known performance problems, for
example the memory usage of the cache and the missing document split. And
performance is very bad if there are many child nodes.

But you are right, there are many unnecessary transformations to/from
strings that should be avoided. Unfortunately, the whole MicroKernel API
has lots of to/from string conversions, but let's keep that for now :-)
Another problem is unnecessary object creation, especially because its
performance overhead is usually not easy to measure.
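
As a rough illustration of keeping revisions as objects instead of strings
in maps and caches, here is a minimal sketch. RevKey and CommitCache are
hypothetical stand-ins invented for this example, not the actual MongoMK
classes, and the three fields are an assumption about what identifies a
revision.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a revision value type (not the MongoMK class).
final class RevKey {
    final long timestamp;
    final int counter;
    final int clusterId;

    RevKey(long timestamp, int counter, int clusterId) {
        this.timestamp = timestamp;
        this.counter = counter;
        this.clusterId = clusterId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof RevKey)) {
            return false;
        }
        RevKey other = (RevKey) o;
        return timestamp == other.timestamp
                && counter == other.counter
                && clusterId == other.clusterId;
    }

    @Override
    public int hashCode() {
        // Computed from primitives only, so no temporary objects are created.
        int h = Long.hashCode(timestamp);
        h = 31 * h + counter;
        h = 31 * h + clusterId;
        return h;
    }
}

// Hypothetical cache keyed by the value object rather than its string form.
final class CommitCache {
    private final Set<RevKey> committed = new HashSet<>();

    void markCommitted(RevKey rev) {
        committed.add(rev);
    }

    boolean isCommitted(RevKey rev) {
        // The lookup uses equals/hashCode directly; nothing is parsed or formatted.
        return committed.contains(rev);
    }
}
```

With a key like this, a lookup compares three primitive fields and never
builds or parses a string.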

I think we should first do the algorithmic optimisations (for example
splitting documents), and then work on the remaining performance problems
as we measure them.

We should look at performance of Oak as a whole, to avoid optimizing the
"wrong" (non-bottleneck) component.

Regards,
Thomas








On 5/16/13 4:13 PM, "Lukas Eder" <mar09086@adobe.com> wrote:

>Hello,
>
>I've been investigating Oak performance and found a couple of cases where
>MongoMK makes use of stringified versions of the
>org.apache.jackrabbit.mongomk.Revision type. One example of such a
>problem was reported here [1].
>
>I have a couple of things that I'd like to talk about, concerning
>Revision.toString() usage:
>
> 1.  toString() should hardly ever be used for anything other than
>debugging. It is very hard to find relevant matching references of
>Object.toString() in a Java code base. In other words, toString() is
>almost not "refactorable". E.g. it is hard to predict what side-effects a
>change to the toString() behaviour will have. Ideally, toString() should
>delegate to another method, such as format(), where the predictable logic
>is really implemented.
> 2.  Revision has a much lower memory footprint than its string
>representation, so it is actually better suited for use in maps,
>caches, etc.
> 3.  MongoMK.isCommitted() is an example that shows how revisions are
>unnecessarily transformed back and forth to strings.
>
>In a larger profiling session, Revision.fromString() and
>Revision.toString() accounted for a total of around 1.5% of all CPU time
>on my machine.
>This may seem like micro-optimisation to some, but I think that we should
>take these things seriously, as they might add up to a significant amount
>of CPU and memory waste, if practiced across a large code base.
>
>Please, let me know what you think.
>
>Cheers
>Lukas
>
>[1]: https://issues.apache.org/jira/browse/OAK-825
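
Point 1 of the quoted message (toString() delegating to an explicit
format() method) could look roughly like the sketch below. FormattedRev is
a hypothetical class made up for this example, and the "r<hex>-<hex>-<hex>"
layout is an assumption for illustration, not necessarily the format that
MongoMK uses.

```java
// Hypothetical stand-in for a revision type (not the MongoMK class).
final class FormattedRev {
    final long timestamp;
    final int counter;
    final int clusterId;

    FormattedRev(long timestamp, int counter, int clusterId) {
        this.timestamp = timestamp;
        this.counter = counter;
        this.clusterId = clusterId;
    }

    /** Serialized form with a fixed layout; callers that persist revisions use this. */
    String format() {
        return "r" + Long.toHexString(timestamp)
                + "-" + Integer.toHexString(counter)
                + "-" + Integer.toHexString(clusterId);
    }

    /** Inverse of format(). */
    static FormattedRev parse(String s) {
        if (!s.startsWith("r")) {
            throw new IllegalArgumentException("Not a revision: " + s);
        }
        String[] parts = s.substring(1).split("-");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not a revision: " + s);
        }
        // Unsigned parsing mirrors toHexString, which prints negative values
        // as their unsigned two's-complement hex digits.
        return new FormattedRev(
                Long.parseUnsignedLong(parts[0], 16),
                Integer.parseUnsignedInt(parts[1], 16),
                Integer.parseUnsignedInt(parts[2], 16));
    }

    /** For debugging and logging only; delegates to format(). */
    @Override
    public String toString() {
        return format();
    }
}
```

Call sites that need the serialized form then reference format() and
parse(), which an IDE can find and refactor, while toString() remains free
for debugging output.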

