lucene-java-user mailing list archives

From starz10de <farag_ah...@yahoo.com>
Subject Re: highlighter by using term offsets
Date Thu, 24 Nov 2011 13:19:59 GMT
Hi,

Here is the full method:

  public static void doPagingSearch(BufferedReader in, Searcher searcher, Query query,
                                    int hitsPerPage, boolean raw, boolean interactive)
      throws IOException, ParseException, InvalidTokenOffsetsException {

    // Collect enough docs to show 500 pages
    TopScoreDocCollector collector = TopScoreDocCollector.create(
        500 * hitsPerPage, false);
    searcher.search(query, collector);
    ScoreDoc[] hits = collector.topDocs().scoreDocs;

    int numTotalHits = collector.getTotalHits();
    // System.out.println(numTotalHits + " total matching documents");
    if (numTotalHits > 0) {
      int start = 0;
      int end = Math.min(numTotalHits, hitsPerPage);

      if (end > hits.length) {
        System.out.println("Only results 1 - " + hits.length + " of "
            + numTotalHits + " total matching documents collected.");
        System.out.println("Collect more (y/n) ?");
        collector = TopScoreDocCollector.create(numTotalHits, false);
        searcher.search(query, collector);
        hits = collector.topDocs().scoreDocs;
      }

      end = Math.min(hits.length, start + hitsPerPage);

      for (int i = 0; i < hits.length; i++) {
        if (raw) {                              // output raw format
          System.out.println("doc=" + hits[i].doc + " score=" + hits[i].score);
          continue;
        }
        Document doc = searcher.doc(hits[i].doc);
        String path = doc.get("path");
        contents = doc.get("contents");
        // Look up the stored term vector for the "contents" field of this hit
        TermPositionVector tpv =
            (TermPositionVector) reader.getTermFreqVector(hits[i].doc, "contents");
        TokenStream tokenStream = TokenSources.getTokenStream(tpv);

        String result =
            highlighter.getBestFragments(tokenStream, contents, 1, "...");
        System.out.println("\n" + result);
      }
    }
  }

When I print "contents" and "hits[i].doc", I can see that neither is null.
The problem is in this line:

"TermPositionVector tpv = (TermPositionVector) reader.getTermFreqVector(hits[i].doc, "contents");"

Does hits[i].doc represent the document ID?
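
One way to narrow down a failure on that line is to inspect what the lookup
returned before casting it. The following is only a minimal sketch under stated
assumptions: FreqVector and PositionVector are hypothetical stand-ins for
Lucene's TermFreqVector and TermPositionVector, declared locally so the example
runs without Lucene on the classpath, and TermVectorCheck is a made-up helper
name. It assumes, as I understand the Lucene 3.x behaviour, that the lookup can
return null when no term vector was stored for the field, or a plain frequency
vector when positions and offsets were not stored.

```java
import java.util.Optional;

// Hypothetical stand-ins for Lucene's TermFreqVector and TermPositionVector,
// declared locally so the sketch compiles without Lucene on the classpath.
interface FreqVector {
    String getField();
}

interface PositionVector extends FreqVector {
    int[] getTermPositions(int termIndex);
}

final class TermVectorCheck {
    private TermVectorCheck() {}

    // Reports what a term-vector lookup returned instead of casting blindly.
    static String describe(FreqVector vector) {
        if (vector == null) {
            return "no term vector stored";
        }
        if (!(vector instanceof PositionVector)) {
            return "term vector without positions";
        }
        return "term vector with positions";
    }

    // Performs the downcast only when it is safe; otherwise returns empty.
    static Optional<PositionVector> asPositionVector(FreqVector vector) {
        if (vector instanceof PositionVector) {
            return Optional.of((PositionVector) vector);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A vector that carries frequencies only (no positions).
        FreqVector frequenciesOnly = () -> "contents";
        // A vector that also carries positions.
        PositionVector withPositions = new PositionVector() {
            @Override public String getField() { return "contents"; }
            @Override public int[] getTermPositions(int termIndex) { return new int[] {0, 4}; }
        };

        System.out.println(describe(null));
        System.out.println(describe(frequenciesOnly));
        System.out.println(describe(withPositions));
        System.out.println(asPositionVector(frequenciesOnly).isPresent());
    }
}
```

In the real method, the same two checks (null, then instanceof) would be applied
to the value returned by reader.getTermFreqVector(hits[i].doc, "contents")
before it is passed to TokenSources.getTokenStream.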

Thanks

