lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Term vector Lucene 4.2
Date Tue, 02 Apr 2013 13:44:18 GMT
On Tue, Apr 2, 2013 at 12:45 PM, andi rexha <a_rexha@hotmail.com> wrote:
> Hi Adrien,
> Thank you very much for the reply.
>
> I have two other small questions about this:
> 1) Is "final int freq = docsAndPositions.freq();" the same as "iterator.totalTermFreq()"?
> In my tests it returns the same result, and from the documentation it seems that the result
> should be the same.

In the case of term vectors, each docs enum contains only one document, so
iterator.totalTermFreq() and docsAndPositions.freq() are equal. This
would not be true if you consumed AtomicReader.fields(), since those
docs enums can contain several documents and totalTermFreq() then counts
the occurrences of the term across all of them.
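
To make the difference concrete, here is a minimal sketch in plain Java. It
is a toy model, not Lucene code: the TermPostings class and its methods are
hypothetical names that only mimic what freq() and totalTermFreq() count for
a single term.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model (NOT the Lucene API): the postings of a single term.
final class TermPostings {
    // docId -> number of occurrences of the term in that document
    private final Map<Integer, Integer> freqByDoc = new LinkedHashMap<>();

    // Record 'freq' occurrences of the term in document 'docId'.
    void add(int docId, int freq) {
        freqByDoc.merge(docId, freq, Integer::sum);
    }

    // Analogue of docsAndPositions.freq(): occurrences in one document.
    int freq(int docId) {
        return freqByDoc.getOrDefault(docId, 0);
    }

    // Analogue of iterator.totalTermFreq(): occurrences summed over all documents.
    long totalTermFreq() {
        long total = 0;
        for (int f : freqByDoc.values()) {
            total += f;
        }
        return total;
    }
}
```

With a single document, as in a term vector, freq(0) and totalTermFreq()
agree. As soon as a second document is added, which is the situation when
reading a whole segment, totalTermFreq() becomes the sum over both documents
and the two values diverge.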

> 2) How do I get the offsets for the term vector? I have tried to iterate over the docsAndPositions
> but I get the following exception:
>
> Exception in thread "main" java.lang.IllegalStateException: Position enum not started

You need to call startOffset() and endOffset() just after nextPosition(),
once for each position:

        for (int i = 0; i < freq; ++i) {
          final int position = docsAndPositions.nextPosition();
          // 'position' is the i-th position of the current term in the document
          final int startOffset = docsAndPositions.startOffset();
          final int endOffset = docsAndPositions.endOffset();
          // start and end offsets of the i-th occurrence of the current term
        }

Beware that these methods will return -1 if you did not index offsets
(see FieldType.setIndexOptions and
IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS).
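
As an illustration of what positions and offsets mean, and of the -1
convention, here is a minimal sketch in plain Java. It is a toy model, not
Lucene code: TermOccurrences, Occurrence and find are hypothetical names,
and splitting on whitespace is an assumption standing in for a real
analyzer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT the Lucene API): positions and character offsets of a term.
final class TermOccurrences {

    // One occurrence of the term: token position plus character offsets.
    static final class Occurrence {
        final int position;
        final int startOffset;
        final int endOffset;

        Occurrence(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Splits 'text' on whitespace and collects every occurrence of 'term'.
    // When 'withOffsets' is false the offsets are reported as -1, mirroring
    // what is described above for fields indexed without offsets.
    static List<Occurrence> find(String text, String term, boolean withOffsets) {
        List<Occurrence> out = new ArrayList<>();
        final int n = text.length();
        int position = 0; // index of the current token
        int i = 0;        // index of the current character
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            final int start = i;
            // Consume the token.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (text.substring(start, i).equals(term)) {
                out.add(new Occurrence(position,
                        withOffsets ? start : -1,
                        withOffsets ? i : -1));
            }
            position++;
        }
        return out;
    }
}
```

For the text "to be or not to be" and the term "be", the sketch reports two
occurrences: position 1 with offsets 3 to 5, and position 5 with offsets 16
to 18. The position counts tokens, while the offsets count characters, with
the end offset exclusive. With withOffsets set to false the positions stay
the same but both offsets come back as -1.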

-- 
Adrien
