lucene-dev mailing list archives

From "Uwe Schindler"
Subject RE: possible TermInfosReader speedup
Date Wed, 08 Apr 2009 22:01:39 GMT
> >> Also, on the other topic - how hard is it to boost
> >> TermEnum.skipTo(term) speed to IndexReader.terms(term) level? Would be
> >> nice for TrieRangeFilter and probably some other filters.
> > I think all that's needed is to implement SegmentTermEnum.skipTo,
> > calling something like tis.terms(Term) but instead of returning a
> > cloned SegmentTermEnum, overwrite the one passed in?
> I bet at least MultiSegmentReader.MultiTermEnum should be affected
> too? (I'm looking at 2.3.2 sources)
> > Does TrieRangeFilter use TermEnum.skipTo? If so, we should certainly
> > fix this.
> It doesn't, but only because skipTo is so obviously slow + I have
> another filter in my project that could use skipTo.
> Refer to:
> 1470?focusedCommentId=12651318&page=com.atlassian.jira.plugin.system.issue
> tabpanels%3Acomment-tabpanel#action_12651318
> Uwe> I am fine with calling IndexReader.terms(Term) to use the cache
> and faster seeking. The cost of creating new instances of TermEnums is
> less than doing disk reads.

I am fascinated; you remember my question... :-)

Yes, if skipTo performed better, I could easily use it in TrieRange and
would be happy, as noted before. Currently, a new TermEnum is created for
each sub-range. When TrieRange was committed and then updated, it was not
clear to me (and still is not) why skipTo should not be as fast as a new
TermEnum.
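
A toy sketch of the trade-off discussed here, under stated assumptions: it is not Lucene code, and ToyTermEnum, seek, and buildTerms are hypothetical names invented for illustration. The skipTo loop imitates a plain linear scan (calling next() until the target is reached), while seek stands in for an index-assisted lookup in the spirit of IndexReader.terms(Term), which hands back a fresh enumerator already positioned at the first term at or after the target.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SkipToSketch {

    /** Toy enumerator over a sorted in-memory term list (hypothetical; not a Lucene class). */
    static class ToyTermEnum {
        private final List<String> terms;
        private int pos;   // index of the current term; -1 means "before the first term"
        long steps = 0;    // number of next() calls, a stand-in for terms read

        ToyTermEnum(List<String> terms, int startPos) {
            this.terms = terms;
            this.pos = startPos;
        }

        String term() {
            return (pos >= 0 && pos < terms.size()) ? terms.get(pos) : null;
        }

        boolean next() {
            steps++;
            pos++;
            return pos < terms.size();
        }

        /** Linear forward scan: advance until the current term is >= target. */
        boolean skipTo(String target) {
            do {
                if (!next()) {
                    return false;
                }
            } while (target.compareTo(term()) > 0);
            return true;
        }
    }

    /** Indexed seek: binary-search the sorted list, return a fresh enum at the first term >= target. */
    static ToyTermEnum seek(List<String> terms, String target) {
        int lo = 0;
        int hi = terms.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (terms.get(mid).compareTo(target) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return new ToyTermEnum(terms, lo);
    }

    /** Builds sorted, zero-padded terms t000000, t000001, ... (made-up data). */
    static List<String> buildTerms(int count) {
        List<String> terms = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            terms.add(String.format(Locale.ROOT, "t%06d", i));
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> terms = buildTerms(100_000);
        String[] subRangeStarts = {"t010000", "t050000", "t090000"};

        // Variant 1: reuse one enumerator and skipTo the start of each sub-range.
        ToyTermEnum reused = new ToyTermEnum(terms, -1);
        for (String start : subRangeStarts) {
            reused.skipTo(start);
        }
        System.out.println("next() calls with one reused enum + skipTo: " + reused.steps);

        // Variant 2: create a fresh, already-positioned enumerator per sub-range.
        long fresh = 0;
        for (String start : subRangeStarts) {
            fresh += seek(terms, start).steps;
        }
        System.out.println("next() calls with a fresh enum per sub-range: " + fresh);
    }
}
```

In this toy model the linear skipTo has to visit every term between sub-ranges, while the seek variant pays only for a binary search plus one object allocation per sub-range. How the two costs compare in a real index depends on the term index interval, caching, and whether the index is memory-mapped, none of which is modeled here.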

> But other people (like me) might use mmapped indexes, so cost(new
> TermEnum)/cost(index read) relation looks different for us.
> > See also this, for historical context:
> >
> ipto+page:1+mid:lb46mbbgpgbnnuxk+state:results
> Darn! And api-wise it looks like a legitimate method :)

