lucene-dev mailing list archives

From Chris Miller <>
Subject Re: [jira] Updated: (LUCENE-1607) String.intern() faster alternative
Date Sun, 19 Apr 2009 18:36:25 GMT
This implementation suffers from thread visibility problems too - writes to 
the array's elements aren't guaranteed to be visible to other threads. In 
addition, hash collisions evict cache entries, which could greatly degrade 
performance in several common use cases. For example, suppose we had a nested 
loop iterating over docs and each doc's field names, interning the names as 
we went. If two fields (F1, F2) both hashed to the same array index, the cache 
would never be hit, since we'd be alternating between interning F1 and F2. 
Without benchmarking/testing it's hard to know how big a problem that would 
be in practice, but the thread visibility problem seems potentially serious.
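
To make the collision scenario concrete, here is a rough single-threaded 
sketch. SlotCache is a hypothetical one-entry-per-index cache written for this 
illustration, not the code from the attached patch. The names "Aa" and "BB" 
are stand-ins for F1 and F2, chosen only because they share the same 
String.hashCode() value (2112) and so always land in the same slot.

```java
public class CollisionThrashDemo {

    /** Hypothetical one-entry-per-slot intern cache (an assumption, not the LUCENE-1607 patch). */
    static final class SlotCache {
        private final String[] slots;
        int hits;
        int misses;

        SlotCache(int sizePowerOfTwo) {
            slots = new String[sizePowerOfTwo];
        }

        String intern(String s) {
            // Map the hash onto a slot; size must be a power of two for the mask to work.
            int idx = s.hashCode() & (slots.length - 1);
            String cached = slots[idx];
            if (cached != null && cached.equals(s)) {
                hits++;
                return cached;
            }
            misses++;
            String interned = s.intern();
            slots[idx] = interned; // evicts any other string that mapped to this slot
            return interned;
        }
    }

    /** Simulates the nested loop: for every doc, intern both field names in turn. */
    static SlotCache run(String f1, String f2, int docs) {
        SlotCache cache = new SlotCache(1024);
        for (int doc = 0; doc < docs; doc++) {
            cache.intern(f1);
            cache.intern(f2);
        }
        return cache;
    }

    public static void main(String[] args) {
        SlotCache colliding = run("Aa", "BB", 1000); // same hash, same slot
        SlotCache distinct = run("x", "y", 1000);    // hashes 120 and 121, different slots
        System.out.println("colliding: hits=" + colliding.hits + " misses=" + colliding.misses);
        System.out.println("distinct:  hits=" + distinct.hits + " misses=" + distinct.misses);
    }
}
```

With the colliding pair every lookup misses and falls through to 
String.intern(), while the non-colliding pair misses only on the first doc. 
The sketch says nothing about how often real field names collide, and it 
doesn't touch the visibility issue at all.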

> Yonik Seeley updated LUCENE-1607:
> ---------------------------------
> Attachment: LUCENE-1607.patch
> Here's a completely lockless and memory barrier free intern() cache.
> This default would be more back compatible since programs may rely on
> String instances being interned via String.intern().
> It does not yet include corresponding Lucene code changes to use the
> StringInterner.
> Thoughts?
>> String.intern() faster alternative
>> ----------------------------------
>> Key: LUCENE-1607
>> URL:
>> Project: Lucene - Java
>> Issue Type: Improvement
>> Reporter: Earwin Burrfoot
>> Fix For: 2.9
>> Attachments: intern.patch, LUCENE-1607.patch
>> By using our own interned string pool on top of default,
>> String.intern() can be greatly optimized.
>> On my setup (java 6) this alternative runs ~15.8x faster for already
>> interned strings, and ~2.2x faster for 'new String(interned)'
>> For java 5 and 4 speedup is lower, but still considerable.