jackrabbit-dev mailing list archives

From "Roman Puchkovskiy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1663) REFERENCE properties produce duplicate strings in memory
Date Thu, 26 Jun 2008 14:45:50 GMT

    [ https://issues.apache.org/jira/browse/JCR-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608457#action_12608457 ]

Roman Puchkovskiy commented on JCR-1663:
----------------------------------------

We have a large repository with several tens of thousands of nodes (which have versions).
Each node has 4 reference properties.
I'm now looking at a heap snapshot of our application. Its overall size is 400+ MB, and almost
200 MB of that is strings (nearly 1.4 million of them). Most of those strings are held by NameImpl
instances (more than 1.3 million instances), and nearly 1.3 million of the strings are duplicates
of just 8 distinct strings: the local names and full names of our 4 reference properties.
So I think it's a real issue that needs to be resolved :) Our use case is an example of
Jackrabbit using memory inefficiently.

Jukka's suggestion looks much better than mine: consider the 1.3 million NameImpl instances
in the example above.
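
To illustrate the kind of duplication seen in the heap snapshot, here is a minimal, self-contained sketch. It does not use any Jackrabbit classes: DuplicateLocalNames, localNameOf and distinctInstances are hypothetical names, the property name is made up, and the "{namespaceURI}localName" layout is assumed for the full name. It only counts how many distinct String instances survive when the same name is parsed repeatedly, with and without String.intern().

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DuplicateLocalNames {

    /** Extracts the local part of an assumed "{namespaceURI}localName" string. */
    static String localNameOf(String expandedName) {
        int close = expandedName.indexOf('}');
        // new String(...) guarantees a separate instance, mimicking a name
        // that was freshly deserialized from a stream.
        return new String(expandedName.substring(close + 1));
    }

    /** Counts String instances by identity (==), not by equals(). */
    static int distinctInstances(String expandedName, int loads, boolean intern) {
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < loads; i++) {
            String local = localNameOf(expandedName);
            seen.add(intern ? local.intern() : local);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String name = "{http://example.com/ns}owner"; // hypothetical property name
        System.out.println(distinctInstances(name, 10000, false)); // prints 10000
        System.out.println(distinctInstances(name, 10000, true));  // prints 1
    }
}
```

Without interning, every load keeps its own copy of an equal string; with interning, all loads share one instance.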

> REFERENCE properties produce duplicate strings in memory
> --------------------------------------------------------
>
>                 Key: JCR-1663
>                 URL: https://issues.apache.org/jira/browse/JCR-1663
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: jackrabbit-core, jackrabbit-spi-commons
>    Affects Versions: 1.4, core 1.4.5
>            Reporter: Roman Puchkovskiy
>
> When a reference property is loaded from the PM, Serializer.deserialize(NodeReferences, InputStream)
> is called, which calls PropertyId.valueOf(String), which in turn calls NameFactoryImpl.create(String),
> which finally splits the full property name into a namespace and a local name. The namespace is
> interned, but the local name is not (comments say that this is done to avoid overfilling perm space).
> So, in the end, a new String instance is created for each local name. This leads to considerable
> memory waste when a repository has a lot of nodes with REFERENCE properties.
> It seems that the local name part could be interned here too, because in most repositories
> it's not allowed to create properties with arbitrary names, so the danger of perm space exhaustion
> does not seem to be a valid argument.
> As for ways to resolve this, maybe a new NameFactory implementation could be created
> which would be used for properties only (and, possibly, mainly in PropertyId.valueOf(String))
> and which would extend the existing NameFactoryImpl, overriding its create(String) method.
> What do you think about all this?
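
The proposal in the description (a factory used for properties only that overrides create(String)) could look roughly like the sketch below. This is not Jackrabbit code: SimpleName, BaseNameFactory and PropertyNameFactory are hypothetical stand-ins for Name, NameFactoryImpl and the proposed subclass, and the "{namespaceURI}localName" input format is an assumption.

```java
public class InterningNameFactorySketch {

    /** Hypothetical stand-in for a qualified name (not Jackrabbit's Name). */
    public static final class SimpleName {
        public final String namespaceURI;
        public final String localName;

        public SimpleName(String namespaceURI, String localName) {
            this.namespaceURI = namespaceURI;
            this.localName = localName;
        }
    }

    /** Stand-in for the existing factory: interns the namespace only. */
    public static class BaseNameFactory {
        public SimpleName create(String expandedName) {
            if (expandedName == null || !expandedName.startsWith("{")) {
                throw new IllegalArgumentException("expected {namespaceURI}localName");
            }
            int close = expandedName.indexOf('}');
            if (close < 0 || close == expandedName.length() - 1) {
                throw new IllegalArgumentException("missing '}' or local name");
            }
            String namespace = expandedName.substring(1, close).intern();
            // The local name is left as a fresh String, as the description says.
            String local = expandedName.substring(close + 1);
            return new SimpleName(namespace, local);
        }
    }

    /** Proposed variant for property names: interns the local name as well. */
    public static class PropertyNameFactory extends BaseNameFactory {
        @Override
        public SimpleName create(String expandedName) {
            SimpleName parsed = super.create(expandedName);
            return new SimpleName(parsed.namespaceURI, parsed.localName.intern());
        }
    }

    public static void main(String[] args) {
        String full = "{http://example.com/ns}owner"; // hypothetical property name
        BaseNameFactory base = new BaseNameFactory();
        BaseNameFactory props = new PropertyNameFactory();
        // Identity comparison shows whether the two names share one String.
        System.out.println(base.create(full).localName == base.create(full).localName);   // prints false
        System.out.println(props.create(full).localName == props.create(full).localName); // prints true
    }
}
```

Keeping the interning in a subclass limits it to property names, so node names with arbitrary values would still go through the non-interning factory.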

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

