lucene-dev mailing list archives

From Toke Eskildsen ...@statsbiblioteket.dk>
Subject Re: Proper use of TermsEnum.seek?
Date Fri, 25 Feb 2011 12:28:05 GMT
On Tue, 2011-02-22 at 12:19 +0100, Simon Willnauer wrote: 

[Toke: Using a partial cache of BytesRef+TermState]

> I don't know how you did implement that part but you might consider
> using something like ByteBlockPool instead of BytesRef instances to
> save an extra amount of memory. Just as a hint you can look at
> BytesRefHash for an example.

Avoiding the overhead of representing the BytesRefs as separate Objects
seems sensible. Unfortunately this isn't possible with TermState, at
least not in general. As I focus quite a bit on memory overhead, it
might make sense to store just the BytesRef and take the performance
penalty of seek(BytesRef) to avoid the Object overhead of TermState.
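
To make the trade-off concrete, here is a minimal sketch of the
"store only the term bytes" idea in plain Java. It is an illustration
under stated assumptions, not Lucene code: the class PackedTermStore
and all of its methods are hypothetical names, and it only mimics the
general approach of keeping term bytes in shared storage instead of one
Object per term. It assumes the cached terms are added in sorted order.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not a Lucene class): keeps cached terms back to back
 * in one byte[] plus an int[] of start offsets, so no per-term object exists.
 */
final class PackedTermStore {
    private byte[] bytes = new byte[64];
    // starts[i] is the offset of term i; starts[count] is the end of the data.
    private int[] starts = new int[8];
    private int count = 0;

    /** Appends a term. Assumption: callers add terms in sorted order. */
    void add(byte[] term) {
        int end = starts[count];
        if (end + term.length > bytes.length) {
            bytes = Arrays.copyOf(bytes, Math.max(bytes.length * 2, end + term.length));
        }
        if (count + 2 > starts.length) {
            starts = Arrays.copyOf(starts, starts.length * 2);
        }
        System.arraycopy(term, 0, bytes, end, term.length);
        count++;
        starts[count] = end + term.length;
    }

    int size() {
        return count;
    }

    /** Copies term i out of the shared array; this is the only allocation. */
    byte[] get(int i) {
        return Arrays.copyOfRange(bytes, starts[i], starts[i + 1]);
    }

    /** Unsigned byte-wise comparison of stored term i against key. */
    private int compare(int i, byte[] key) {
        int off = starts[i];
        int len = starts[i + 1] - off;
        int n = Math.min(len, key.length);
        for (int k = 0; k < n; k++) {
            int a = bytes[off + k] & 0xFF;
            int b = key[k] & 0xFF;
            if (a != b) {
                return a - b;
            }
        }
        return len - key.length;
    }

    /** Binary search: index of key, or -(insertionPoint) - 1 if absent. */
    int indexOf(byte[] key) {
        int lo = 0;
        int hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int c = compare(mid, key);
            if (c < 0) {
                lo = mid + 1;
            } else if (c > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }
}
```

With such a store, the per-term cost is the term bytes plus one int.
The price is the one named above: a cache hit only yields the term, so
it still has to go through seek(BytesRef) rather than a TermState-based
seek.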

> I think we need to check if that BytesRef is really needed. I hope we
> can get rid of it eventually.

It does seem a bit peculiar that it is needed for a seek using a
previously delivered marker. Maybe the TermState could hold a reference
to the BytesRef itself, if it is needed by the implementation?

Regards,
Toke Eskildsen

