lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] Closed: (LUCENE-1308) Remove String.intern() from to increase performance and lower contention
Date Tue, 11 Aug 2009 00:32:14 GMT


Mark Miller closed LUCENE-1308.

    Resolution: Duplicate

> Remove String.intern() from to increase performance and lower contention
> -----------------------------------------------------------------------------------
>                 Key: LUCENE-1308
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.3.2
>            Reporter: Rene Schwietzke
>         Attachments:
> Right now, *document.Field is interning all field names. While this makes sense because
it lowers the overall memory consumption, the method intern() of String is known to be difficult
to handle.
> 1) it is a native call and therefore slower than anything on the Java level
> 2) the String pool is part of the perm space and not of the general heap, so its size
is more restricted and needs extra VM params to be managed
> 3) Some VMs show GC problems with strings in the string pool
> The suggested solution is a WeakHashMap instead, which takes care of unifying the String instances
while keeping the pool in the heap space and releasing a String when it is
no longer needed. For extra performance in a concurrent environment, a ConcurrentHashMap-like
implementation of a weak hash map is recommended, because we mostly read from the pool.
> We saw a 10% improvement in throughput and response time of our application, and the application
is not only doing searches (we read a lot of documents from the result). So a test case that
measures this code path alone could show even more improvement in single and concurrent usage.
> The Cache:
> /** Cache to replace the expensive String.intern() call with a pure Java version */
> private static final Map<String, WeakReference<String>> unifiedStringsCache =
>    Collections.synchronizedMap(new WeakHashMap<String, WeakReference<String>>(109));
> The access to it, instead of = name.intern();
> // unify the strings, but do not use the expensive String.intern() version
> // which is not "weak enough", uses the perm space and is a native call
> String unifiedName = null;
> WeakReference<String> ref = unifiedStringsCache.get(name);
> if (ref != null)
> {
>     unifiedName = ref.get();
> }
> if (unifiedName == null)
> {
>     unifiedStringsCache.put(name, new WeakReference<String>(name));
>     unifiedName = name;
> }
> = unifiedName;
> I guess it is sufficient to have almost all field names interned, so I skipped the additional
synchronization around the access and take the risk that only 99.99% :) of all field names
are interned.
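
For reference, the cache and its access code quoted above can be put together into one
compilable unit. This is only a minimal sketch of the approach described in the issue, not
the code from any attachment: the class name WeakStringPool and the method name unify are
made up for illustration. As in the original snippet, the get-then-put sequence is not atomic,
so two threads may briefly see different instances of the same name.

```java
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical helper class; the name is an assumption, not part of Lucene.
final class WeakStringPool {

    /** Heap-based replacement for String.intern(): weak keys, weak values. */
    private static final Map<String, WeakReference<String>> POOL =
        Collections.synchronizedMap(new WeakHashMap<String, WeakReference<String>>(109));

    private WeakStringPool() {
    }

    /** Returns the pooled instance equal to name, adding name itself if none exists. */
    static String unify(String name) {
        WeakReference<String> ref = POOL.get(name);
        String unified = (ref != null) ? ref.get() : null;
        if (unified == null) {
            // Not pooled yet (or already collected): name becomes the pooled instance.
            POOL.put(name, new WeakReference<String>(name));
            unified = name;
        }
        return unified;
    }

    public static void main(String[] args) {
        String a = new String("title");
        String b = new String("title");
        System.out.println(a == b);               // prints false
        System.out.println(unify(a) == unify(b)); // prints true
    }
}
```

Because both the key and the value only weakly reference the pooled String, an entry can be
reclaimed once no field holds that name any more, which is the "weak enough" behaviour the
description asks for.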

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
