lucene-java-user mailing list archives

From <Vitaly_Arte...@McAfee.com>
Subject Lucene 4.0.0 - find offsets for phrase queries
Date Mon, 17 Dec 2012 16:46:05 GMT
Hi all,
I am using Lucene 4.0.
I am trying to find the offsets for phrase queries.
My code works when I search for a single word, but when I call it with a phrase I do not get any offsets.
termsEnum.seekExact returns false for phrase queries.

reader = DirectoryReader.open(mIndexDir);
IndexSearcher searcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser(Version.LUCENE_40, mField, mAnalyzer);
Query query = parser.parse(aQuery);

TopScoreDocCollector collector = TopScoreDocCollector.create(100, true);
searcher.search(query, collector);
ScoreDoc[] hits = collector.topDocs().scoreDocs;

for (int i = 0; i < hits.length; ++i) {
    int docId = hits[i].doc;

    Document d = searcher.doc(docId);

    Terms tfvector = reader.getTermVector(docId, "contents");

    if (tfvector != null) {
        TermsEnum termsEnum = tfvector.iterator(null);

        // This returns false when aQuery is a phrase.
        if (termsEnum.seekExact(new BytesRef(aQuery.toLowerCase()), false)) {
            DocsAndPositionsEnum dpEnum = null;
            dpEnum = termsEnum.docsAndPositions(null, dpEnum);

            if (dpEnum != null) {
                int freq = dpEnum.freq();

                int maxOcc = 20;

                while (freq-- > 0 && maxOcc-- > 0) {
                    dpEnum.nextPosition();
                    System.out.println("Start offset " + dpEnum.startOffset()
                            + " End offset " + dpEnum.endOffset());
                }
            }
        }
    }
}

What is the problem?

Thanks in advance, Vitaly.
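
One possible cause, offered as an assumption rather than a confirmed diagnosis: a term vector holds one entry per analyzed term, so unless the analyzer emits the whole phrase as a single token, no entry equals the full phrase text and an exact lookup on it fails. The sketch below is plain Java with no Lucene calls. The names PhraseOffsets, Occurrence, buildTermVector and phraseOffsets are hypothetical, and the whitespace/lowercase tokenizer is only a stand-in for a real analyzer. It shows that looking up the phrase as one key finds nothing, while looking up each term and requiring consecutive positions recovers the phrase offsets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for a per-document term vector. This is NOT the Lucene API.
class PhraseOffsets {

    // One occurrence of a single term: token position plus character offsets.
    static final class Occurrence {
        final int position;
        final int start;
        final int end;

        Occurrence(int position, int start, int end) {
            this.position = position;
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical analyzer: split on whitespace, lowercase, one map entry per term.
    static Map<String, List<Occurrence>> buildTermVector(String text) {
        Map<String, List<Occurrence>> vector = new HashMap<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            vector.computeIfAbsent(term, k -> new ArrayList<>())
                  .add(new Occurrence(position++, start, i));
        }
        return vector;
    }

    // Returns {startOffset, endOffset} for every place where the phrase terms
    // occur at consecutive token positions.
    static List<int[]> phraseOffsets(Map<String, List<Occurrence>> vector, String phrase) {
        String[] terms = phrase.trim().toLowerCase(Locale.ROOT).split("\\s+");
        List<int[]> hits = new ArrayList<>();
        List<Occurrence> firsts = vector.get(terms[0]);
        if (firsts == null) {
            return hits;
        }
        for (Occurrence first : firsts) {
            Occurrence last = first;
            for (int t = 1; t < terms.length && last != null; t++) {
                last = find(vector.get(terms[t]), first.position + t);
            }
            if (last != null) {
                hits.add(new int[] { first.start, last.end });
            }
        }
        return hits;
    }

    // Finds the occurrence at the given token position, or null if there is none.
    static Occurrence find(List<Occurrence> occurrences, int position) {
        if (occurrences == null) {
            return null;
        }
        for (Occurrence o : occurrences) {
            if (o.position == position) {
                return o;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<Occurrence>> tv = buildTermVector("The quick brown fox");
        // The phrase is not a single term, so a whole-phrase lookup misses.
        System.out.println(tv.containsKey("quick brown")); // prints "false"
        for (int[] hit : phraseOffsets(tv, "quick brown")) {
            // prints "Start offset 4 End offset 15"
            System.out.println("Start offset " + hit[0] + " End offset " + hit[1]);
        }
    }
}
```

If this assumption holds for the index above, the equivalent step in the posted code would be to seek each analyzed term of the phrase separately and compare the positions reported for each, instead of seeking the raw query string once.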
