lucene-java-user mailing list archives

From Szymon Sutek <dagothvers...@gmail.com>
Subject Unable to retrieve TermVectorOffsets using Lucene 6
Date Fri, 02 Dec 2016 11:48:34 GMT
Hello, I am trying to index a txt file and then retrieve its terms' offset
positions (for every occurrence, if a term occurred more than once during
indexing). Here are the most important parts of the code:

1) StandardAnalyzer is used.
2) FieldType used while indexing:

    FieldType fieldType = new FieldType();

    fieldType.setTokenized(true);
    fieldType.setStoreTermVectors(true);
    fieldType.setStoreTermVectorPositions(true);
    fieldType.setStoreTermVectorOffsets(true);

    fieldType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);

3) doc.add(new Field("fieldname", reader, fieldType));


4) After successfully creating the index, I use indexReader to read the terms
and iterate through all of them, but I have no idea how to collect the
offset vector.
In earlier versions I would cast the TermVector to the vector type I needed
and get the offset list for a concrete term value. Now I am stuck on this
part of the code:

Terms terms = indexReader.getTermVector(0, "text");
TermsEnum iterator = terms.iterator();

BytesRef byteRef = null;

while((byteRef = iterator.next()) != null) {
    String term = byteRef.utf8ToString();
    // Here I don't know how to get the offset vector for the given term
}

I would be grateful for any help!
