lucene-dev mailing list archives

From Adriano Crestani <adrianocrest...@apache.org>
Subject Cloning TermAttribute objects
Date Tue, 13 Jul 2010 07:59:43 GMT
Hi,

Why does the TermAttributeImpl.clone() method use buff.clone() instead of
System.arraycopy to clone its internal buffer? Is it for performance reasons?

I have the following scenario:

...
public boolean incrementToken() {
...
String twoHundredKCharsString = "abc....";
String smallString = "test";

termAttribute.setTermBuffer(twoHundredKCharsString);
State largeStringState = captureState();

termAttribute.setTermBuffer(smallString);
State smallStringState = captureState();

...
}
...

And guess what? smallStringState has a TermAttribute object that still
holds an internal buffer of 200k chars, even though its term is only
four characters long!

I googled around and found that clone() and System.arraycopy perform
about the same for small arrays, and that clone() only performs better
for large arrays.

So, if large string inputs are not a real scenario, why not use
System.arraycopy instead of clone()? And if they are a real scenario,
Lucene should definitely not be copying the entire buffer for small strings.
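
To make the difference concrete, here is a minimal, Lucene-free sketch of the two copying strategies. SketchTermBuffer, cloneWholeBuffer() and cloneTrimmed() are hypothetical names made up for illustration; this is not the actual TermAttributeImpl code, only an assumption about the reuse-the-larger-buffer behavior described above.

```java
import java.util.Arrays;

// Hypothetical stand-in for a term attribute; NOT Lucene's TermAttributeImpl.
final class SketchTermBuffer {
    char[] buf = new char[16];   // backing array; may be larger than the term
    int length = 0;              // number of valid chars in buf

    // Replace the term, growing the backing array only when it is too small.
    void setTerm(String term) {
        if (term.length() > buf.length) {
            buf = new char[term.length()];
        }
        term.getChars(0, term.length(), buf, 0);
        length = term.length();
    }

    // Copy that duplicates the whole backing array, as buf.clone() does.
    SketchTermBuffer cloneWholeBuffer() {
        SketchTermBuffer copy = new SketchTermBuffer();
        copy.buf = buf.clone();
        copy.length = length;
        return copy;
    }

    // Copy that allocates only `length` chars and copies the valid prefix.
    SketchTermBuffer cloneTrimmed() {
        SketchTermBuffer copy = new SketchTermBuffer();
        copy.buf = new char[length];
        System.arraycopy(buf, 0, copy.buf, 0, length);
        copy.length = length;
        return copy;
    }

    String term() {
        return new String(buf, 0, length);
    }
}

final class CloneVsArraycopyDemo {
    public static void main(String[] args) {
        char[] big = new char[200_000];
        Arrays.fill(big, 'a');

        SketchTermBuffer att = new SketchTermBuffer();
        att.setTerm(new String(big));   // backing array grows to 200k chars
        att.setTerm("test");            // backing array is reused, not shrunk

        System.out.println(att.cloneWholeBuffer().buf.length); // 200000
        System.out.println(att.cloneTrimmed().buf.length);     // 4
    }
}
```

Both copies report the same term ("test"); only the size of the array they keep alive differs.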

Maybe the TermAttribute interface could expose a method like
shrinkBuffer() that users could invoke when they need to.
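
A rough sketch of what such a method might look like, assuming a simple char[] plus length representation. ShrinkableTermBuffer and capacity() are hypothetical names, and shrinkBuffer() here is only the proposal above, not an existing Lucene API.

```java
import java.util.Arrays;

// Hypothetical attribute with a shrinkBuffer() method; the method name comes
// from the proposal above and is NOT part of Lucene's TermAttribute interface.
final class ShrinkableTermBuffer {
    private char[] buf = new char[16];  // backing array
    private int length = 0;             // number of valid chars in buf

    // Replace the term, growing the backing array only when it is too small.
    void setTerm(String term) {
        if (term.length() > buf.length) {
            buf = new char[term.length()];
        }
        term.getChars(0, term.length(), buf, 0);
        length = term.length();
    }

    // Reallocate the backing array so its capacity equals the term length.
    void shrinkBuffer() {
        if (buf.length > length) {
            buf = Arrays.copyOf(buf, length);
        }
    }

    // Current size of the backing array, exposed for inspection only.
    int capacity() {
        return buf.length;
    }

    String term() {
        return new String(buf, 0, length);
    }
}
```

With something like this, a filter could call shrinkBuffer() right before captureState() whenever it knows a very large term has just passed through, and skip the extra allocation otherwise.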

Thoughts?

Best Regards,
Adriano Crestani


