lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Kristensson <>
Subject Re: IndexWriter.close() performance issue
Date Tue, 23 Nov 2010 22:22:34 GMT
I've tried the suggestion below, but it really doesn't seem to have any impact. I guess that's
not surprising since 80% of the CPU time when I ran hprof was in String.intern(), not in the
StringHelper class. 

Clearly, if I'm going to hack things up at this level, I've got some work do to, including:
- Synchronizing access to the storage mechanism I use for the strings
- Using something else (String?) for the hashCode (instead of the String's built-in integer

Could I not use the HashMap solution I proposed to ensure that all field name strings that
are the same refer to the same instance of the String? Does a HashMap.get() not return the
same instance with every call? 

I'm also intrigued by a comment from Mike about FieldInfos and using those to control the
uniqueness. Any suggestions on how I go about doing that? Would that be instead of monkeying
with StringHelper or in addition to it?


On Nov 20, 2010, at 5:44 AM, Yonik Seeley wrote:

> On Fri, Nov 19, 2010 at 5:41 PM, Mark Kristensson
> <> wrote:
>> Here's the changes I made to org.apache.lucene.util.StringHelper:
>>  //public static StringInterner interner = new SimpleStringInterner(1024,8);
> As Mike said, the real fix for trunk is to get rid of interning.
> But for your version, you could try making the string intern cache larger.
> StringHelper.interner = new SimpleStringInterner(300000,8);
> -Yonik
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message