lucene-java-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: API access to in-memory tii file (3.x not flex).
Date Wed, 10 Nov 2010 21:39:03 GMT
In a word, no.  You'd need to customize the Lucene source to accomplish this.
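
For reference, the fallback described in the question below (walk the full term enumeration and keep every 128th term) could look roughly like this. It is a minimal sketch using only the standard library: the class and method names (TermSampler, sampleEveryNth) are hypothetical, and a plain Iterator stands in for the enumeration. In Lucene 3.x the same loop would be driven by the TermEnum returned from reader.terms(), using next() and term().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: samples every interval-th item
// from a sorted stream of terms by visiting every item once.
final class TermSampler {

    /**
     * Walks the whole iterator and keeps the interval-th, 2*interval-th, ...
     * elements in iteration order. Every element is still visited, which is
     * exactly the cost the original question hoped to avoid.
     */
    static <T> List<T> sampleEveryNth(Iterator<T> terms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        List<T> sample = new ArrayList<T>();
        long seen = 0;
        while (terms.hasNext()) {
            T term = terms.next();
            seen++;
            if (seen % interval == 0) {
                sample.add(term);
            }
        }
        return sample;
    }

    public static void main(String[] args) {
        // Stand-in for a sorted term dictionary (assumption for illustration).
        List<String> sortedTerms = Arrays.asList(
                "apple", "banana", "cherry", "date", "elderberry", "fig", "grape");
        // prints [cherry, fig]
        System.out.println(sampleEveryNth(sortedTerms.iterator(), 3));
    }
}
```

This still reads every term, so on a very large index it does a full pass over the term dictionary; it only shows the sampling logic, not a way around that cost.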

On Wed, Nov 10, 2010 at 1:02 PM, Burton-West, Tom <tburtonw@umich.edu> wrote:
> Hello all,
>
> We have an extremely large number of terms in our indexes.  I want to
> extract a sample of the terms, say every 128th term.  If I use code based
> on org.apache.lucene.misc.HighFreqTerms or
> org.apache.lucene.index.CheckIndex, I would get a TermEnum, call
> termEnum.next() 128 times, grab the term, and then call next() another
> 128 times.
> TermEnum termEnum = reader.terms();
> while (termEnum.next()) {
>     // count the calls and keep every 128th termEnum.term()
> }
>
> Since the tii file contains every 128th (or IndexInterval) term and it is
> loaded into memory, is there some programmatic way (in the public API) to
> read that data structure in memory, rather than having to force Lucene to
> actually read the entire tis file by using termEnum.next()?
>
>
> Tom Burton-West
> http://www.hathitrust.org/blogs/large-scale-search
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

