lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: svn commit: r659664 - in /lucene/solr/trunk: CHANGES.txt src/java/org/apache/solr/common/params/HighlightParams.java src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java src/test/org/apache/solr/highlight/HighlighterTest.java
Date Fri, 23 May 2008 22:47:28 GMT
I'm getting compile errors on clean :
  [mkdir] Created dir:/solr-trunk/build/core
     [javac] Compiling 314 source files to ...solr-trunk/build/core
     [javac] ...solr-trunk/src/java/org/apache/solr/highlight/ 
DefaultSolrHighlighter.java:45: cannot find symbol
     [javac] symbol  : class SpanScorer
     [javac] location: package org.apache.lucene.search.highlight
     [javac] import org.apache.lucene.search.highlight.SpanScorer;
     [javac]                                           ^
     [javac] ...solr-trunk/src/java/org/apache/solr/highlight/ 
DefaultSolrHighlighter.java:144: cannot find symbol
     [javac] symbol  : class SpanScorer
     [javac] location: class  
org.apache.solr.highlight.DefaultSolrHighlighter
     [javac]   private SpanScorer getSpanQueryScorer(Query query,  
String fieldName, CachingTokenFilter tokenStream, SolrQueryRequest  
request) throws IOException {
     [javac]           ^
     [javac] ...solr-trunk/src/java/org/apache/solr/highlight/ 
DefaultSolrHighlighter.java:147: cannot find symbol
     [javac] symbol  : class SpanScorer
     [javac] location: class  
org.apache.solr.highlight.DefaultSolrHighlighter
     [javac]       return new SpanScorer(query, fieldName, tokenStream);
     [javac]                  ^
     [javac] ...solr-trunk/src/java/org/apache/solr/highlight/ 
DefaultSolrHighlighter.java:150: cannot find symbol
     [javac] symbol  : class SpanScorer
     [javac] location: class  
org.apache.solr.highlight.DefaultSolrHighlighter
     [javac]       return new SpanScorer(query, null, tokenStream);
     [javac]                  ^
     [javac] Note: Some input files use or override a deprecated API.
     [javac] Note: Recompile with -Xlint:deprecation for details.
     [javac] Note: Some input files use unchecked or unsafe operations.
     [javac] Note: Recompile with -Xlint:unchecked for details.
     [javac] 4 errors

SVN Info:

 >svn info
Path: .
URL: https://svn.apache.org/repos/asf/lucene/solr/trunk
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 659696
Node Kind: directory
Schedule: normal
Last Changed Author: otis
Last Changed Rev: 659668
Last Changed Date: 2008-05-23 17:32:45 -0400 (Fri, 23 May 2008)

Doesn't this require new Lucene jars?

-Grant



On May 23, 2008, at 5:23 PM, otis@apache.org wrote:

> Author: otis
> Date: Fri May 23 14:23:25 2008
> New Revision: 659664
>
> URL: http://svn.apache.org/viewvc?rev=659664&view=rev
> Log:
> SOLR-553 Use SpanScorer when highlighting phrase terms and  
> hl.usePhraseHighlighter=true
>
> Modified:
>    lucene/solr/trunk/CHANGES.txt
>    lucene/solr/trunk/src/java/org/apache/solr/common/params/ 
> HighlightParams.java
>    lucene/solr/trunk/src/java/org/apache/solr/highlight/ 
> DefaultSolrHighlighter.java
>    lucene/solr/trunk/src/test/org/apache/solr/highlight/ 
> HighlighterTest.java
>
> Modified: lucene/solr/trunk/CHANGES.txt
> URL: http://svn.apache.org/viewvc/lucene/solr/trunk/CHANGES.txt?rev=659664&r1=659663&r2=659664&view=diff
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> --- lucene/solr/trunk/CHANGES.txt (original)
> +++ lucene/solr/trunk/CHANGES.txt Fri May 23 14:23:25 2008
> @@ -409,7 +409,14 @@
>
> 31. SOLR-514: Added explicit media-type with UTF* charset to *.xsl  
> files that
>     don't already have one. (hossman)
> -
> +
> +32. SOLR-505: Give RequestHandlers the possiblity to suppress the  
> generation
> +    of HTTP caching headers.  (Thomas Peuss via Otis Gospodnetic)
> +
> +33. SOLR-553: Handle highlighting of phrase terms better when
> +    hl.usePhraseHighligher=true URL param is used.
> +    (Bojan Smid via Otis Gospodnetic)
> +
> Other Changes
>  1. SOLR-135: Moved common classes to org.apache.solr.common and  
> altered the
>     build scripts to make two jars: apache-solr-1.3.jar and
>
> Modified: lucene/solr/trunk/src/java/org/apache/solr/common/params/ 
> HighlightParams.java
> URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/params/HighlightParams.java?rev=659664&r1=659663&r2=659664&view=diff
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> --- lucene/solr/trunk/src/java/org/apache/solr/common/params/ 
> HighlightParams.java (original)
> +++ lucene/solr/trunk/src/java/org/apache/solr/common/params/ 
> HighlightParams.java Fri May 23 14:23:25 2008
> @@ -33,6 +33,8 @@
>   public static final String FIELD_MATCH = HIGHLIGHT 
> +".requireFieldMatch";
>   public static final String ALTERNATE_FIELD = HIGHLIGHT 
> +".alternateField";
>   public static final String ALTERNATE_FIELD_LENGTH = HIGHLIGHT 
> +".maxAlternateFieldLength";
> +
> +  public static final String USE_PHRASE_HIGHLIGHTER = HIGHLIGHT 
> +".usePhraseHighlighter";
>
>   public static final String MERGE_CONTIGUOUS_FRAGMENTS = HIGHLIGHT  
> + ".mergeContiguous";
>   // Formatter
>
> Modified: lucene/solr/trunk/src/java/org/apache/solr/highlight/ 
> DefaultSolrHighlighter.java
> URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java?rev=659664&r1=659663&r2=659664&view=diff
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> --- lucene/solr/trunk/src/java/org/apache/solr/highlight/ 
> DefaultSolrHighlighter.java (original)
> +++ lucene/solr/trunk/src/java/org/apache/solr/highlight/ 
> DefaultSolrHighlighter.java Fri May 23 14:23:25 2008
> @@ -32,6 +32,7 @@
> import javax.xml.xpath.XPathConstants;
>
> import org.apache.lucene.analysis.Analyzer;
> +import org.apache.lucene.analysis.CachingTokenFilter;
> import org.apache.lucene.analysis.Token;
> import org.apache.lucene.analysis.TokenFilter;
> import org.apache.lucene.analysis.TokenStream;
> @@ -41,6 +42,7 @@
> import org.apache.lucene.search.highlight.Fragmenter;
> import org.apache.lucene.search.highlight.Highlighter;
> import org.apache.lucene.search.highlight.QueryScorer;
> +import org.apache.lucene.search.highlight.SpanScorer;
> import org.apache.lucene.search.highlight.TextFragment;
> import org.apache.lucene.search.highlight.TokenSources;
> import org.apache.solr.common.SolrException;
> @@ -55,7 +57,6 @@
> import org.apache.solr.search.DocIterator;
> import org.apache.solr.search.DocList;
> import org.apache.solr.search.SolrIndexSearcher;
> -import org.apache.solr.util.SolrPluginUtils;
> import org.apache.solr.util.plugin.NamedListPluginLoader;
> import org.w3c.dom.NodeList;
>
> @@ -92,6 +93,27 @@
>     formatters.put( null, fmt );
>   }
>
> +  /**
> +   * Return a phrase Highlighter appropriate for this field.
> +   * @param query The current Query
> +   * @param fieldName The name of the field
> +   * @param request The current SolrQueryRequest
> +   * @param tokenStream document text CachingTokenStream
> +   * @throws IOException
> +   */
> +  protected Highlighter getPhraseHighlighter(Query query, String  
> fieldName, SolrQueryRequest request, CachingTokenFilter tokenStream)  
> throws IOException {
> +    SolrParams params = request.getParams();
> +    Highlighter highlighter = null;
> +
> +    highlighter = new Highlighter(getFormatter(fieldName, params),  
> getSpanQueryScorer(query, fieldName, tokenStream, request));
> +
> +    highlighter.setTextFragmenter(getFragmenter(fieldName, params));
> +    highlighter.setMaxDocBytesToAnalyze(params.getFieldInt(
> +        fieldName, HighlightParams.MAX_CHARS,
> +        Highlighter.DEFAULT_MAX_DOC_BYTES_TO_ANALYZE));
> +
> +    return highlighter;
> +  }
>
>   /**
>    * Return a Highlighter appropriate for this field.
> @@ -112,6 +134,24 @@
>   }
>
>   /**
> +   * Return a SpanScorer suitable for this Query and field.
> +   * @param query The current query
> +   * @param tokenStream document text CachingTokenStream
> +   * @param fieldName The name of the field
> +   * @param request The SolrQueryRequest
> +   * @throws IOException
> +   */
> +  private SpanScorer getSpanQueryScorer(Query query, String  
> fieldName, CachingTokenFilter tokenStream, SolrQueryRequest request)  
> throws IOException {
> +    boolean reqFieldMatch =  
> request.getParams().getFieldBool(fieldName,  
> HighlightParams.FIELD_MATCH, false);
> +    if (reqFieldMatch) {
> +      return new SpanScorer(query, fieldName, tokenStream);
> +    }
> +    else {
> +      return new SpanScorer(query, null, tokenStream);
> +    }
> +  }
> +
> +  /**
>    * Return a QueryScorer suitable for this Query and field.
>    * @param query The current query
>    * @param fieldName The name of the field
> @@ -230,32 +270,59 @@
>           fieldName = fieldName.trim();
>           String[] docTexts = doc.getValues(fieldName);
>           if (docTexts == null) continue;
> +
> +          TokenStream tstream = null;
> +
> +          // create TokenStream
> +          if (docTexts.length == 1) {
> +            // single-valued field
> +            try {
> +              // attempt term vectors
> +              tstream =  
> TokenSources.getTokenStream(searcher.getReader(), docId, fieldName);
> +            }
> +            catch (IllegalArgumentException e) {
> +              // fall back to anaylzer
> +              tstream = new  
> TokenOrderingFilter(schema.getAnalyzer().tokenStream(fieldName, new  
> StringReader(docTexts[0])), 10);
> +            }
> +          }
> +          else {
> +            // multi-valued field
> +            tstream = new MultiValueTokenStream(fieldName,  
> docTexts, schema.getAnalyzer(), true);
> +          }
> +
> +          Highlighter highlighter;
> +
> +          if  
> (Boolean 
> .valueOf 
> (req.getParams().get(HighlightParams.USE_PHRASE_HIGHLIGHTER))) {
> +            // wrap CachingTokenFilter around TokenStream for reuse
> +            tstream = new CachingTokenFilter(tstream);
> +
> +            // get highlighter
> +            highlighter = getPhraseHighlighter(query, fieldName,  
> req, (CachingTokenFilter) tstream);
> +
> +            // after highlighter initialization, reset tstream  
> since construction of highlighter already used it
> +            tstream.reset();
> +          }
> +          else {
> +            // use "the old way"
> +            highlighter = getHighlighter(query, fieldName, req);
> +          }
>
> -          // get highlighter, and number of fragments for this field
> -          Highlighter highlighter = getHighlighter(query,  
> fieldName, req);
>           int numFragments = getMaxSnippets(fieldName, params);
>           boolean mergeContiguousFragments =  
> isMergeContiguousFragments(fieldName, params);
>
>            String[] summaries = null;
>            TextFragment[] frag;
>            if (docTexts.length == 1) {
> -              // single-valued field
> -              TokenStream tstream;
> -              try {
> -                 // attempt term vectors
> -                 tstream =  
> TokenSources.getTokenStream(searcher.getReader(), docId, fieldName);
> -              }
> -              catch (IllegalArgumentException e) {
> -                 // fall back to analyzer
> -                 tstream = new  
> TokenOrderingFilter(schema.getAnalyzer().tokenStream(fieldName, new  
> StringReader(docTexts[0])), 10);
> -              }
>               frag = highlighter.getBestTextFragments(tstream,  
> docTexts[0], mergeContiguousFragments, numFragments);
>            }
>            else {
> -              // multi-valued field
> -              MultiValueTokenStream tstream;
> -              tstream = new MultiValueTokenStream(fieldName,  
> docTexts, schema.getAnalyzer(), true);
> -              frag = highlighter.getBestTextFragments(tstream,  
> tstream.asSingleValue(), false, numFragments);
> +               StringBuilder singleValue = new StringBuilder();
> +
> +               for (String txt:docTexts) {
> +             	  singleValue.append(txt);
> +               }
> +
> +              frag = highlighter.getBestTextFragments(tstream,  
> singleValue.toString(), false, numFragments);
>            }
>            // convert fragments back into text
>            // TODO: we can include score and position information in  
> output as snippet attributes
> @@ -303,12 +370,8 @@
>   }
> }
>
> -
> -
> -
> /**
> - * Helper class which creates a single TokenStream out of values  
> from a
> - * multi-valued field.
> + * Creates a single TokenStream out multi-value field values.
>  */
> class MultiValueTokenStream extends TokenStream {
>   private String fieldName;
> @@ -378,7 +441,6 @@
>       sb.append(str);
>     return sb.toString();
>   }
> -
> }
>
>
> @@ -424,5 +486,3 @@
>     return queue.isEmpty() ? null : queue.removeFirst();
>   }
> }
> -
> -
>
> Modified: lucene/solr/trunk/src/test/org/apache/solr/highlight/ 
> HighlighterTest.java
> URL: http://svn.apache.org/viewvc/lucene/solr/trunk/src/test/org/apache/solr/highlight/HighlighterTest.java?rev=659664&r1=659663&r2=659664&view=diff
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> --- lucene/solr/trunk/src/test/org/apache/solr/highlight/ 
> HighlighterTest.java (original)
> +++ lucene/solr/trunk/src/test/org/apache/solr/highlight/ 
> HighlighterTest.java Fri May 23 14:23:25 2008
> @@ -481,4 +481,59 @@
>             "//lst[@name='highlighting']/lst[@name='1']/ 
> arr[@name='t_text']/str[.='a piece of text']"
>             );
>   }
> +
> +  public void testPhraseHighlighter() {
> +    HashMap<String,String> args = new HashMap<String,String>();
> +    args.put("hl", "true");
> +    args.put("hl.fl", "t_text");
> +    args.put("hl.fragsize", "40");
> +    args.put("hl.snippets", "10");
> +
> +    TestHarness.LocalRequestFactory sumLRF = h.getRequestFactory(
> +      "standard", 0, 200, args);
> +
> +    // String borrowed from Lucene's HighlighterTest
> +    String t = "This piece of text refers to Kennedy at the  
> beginning then has a longer piece of text that is very long in the  
> middle and finally ends with another reference to Kennedy";
> +
> +    assertU(adoc("t_text", t, "id", "1"));
> +    assertU(commit());
> +    assertU(optimize());
> +
> +    String oldHighlight1 = "//lst[@name='1']/arr[@name='t_text']/ 
> str[.='This piece of <em>text</em> <em>refers</em> to Kennedy']";
> +    String oldHighlight2 = "//lst[@name='1']/arr[@name='t_text']/ 
> str[.=' at the beginning then has a longer piece of <em>text</em>']";
> +    String oldHighlight3 = "//lst[@name='1']/arr[@name='t_text']/ 
> str[.=' with another <em>reference</em> to Kennedy']";
> +    String newHighlight1 = "//lst[@name='1']/arr[@name='t_text']/ 
> str[.='This piece of <em>text</em> <em>refers</em> to Kennedy']";
> +
> +    // check if old functionality is still the same
> +    assertQ("Phrase highlighting - old",
> +        sumLRF.makeRequest("t_text:\"text refers\""),
> +        "//lst[@name='highlighting']/lst[@name='1']",
> +        oldHighlight1, oldHighlight2, oldHighlight3
> +        );
> +
> +    assertQ("Phrase highlighting - old",
> +        sumLRF.makeRequest("t_text:text refers"),
> +        "//lst[@name='highlighting']/lst[@name='1']",
> +        oldHighlight1, oldHighlight2, oldHighlight3
> +        );
> +
> +    // now check if Lucene-794 highlighting works as expected
> +    args.put("hl.usePhraseHighlighter", "true");
> +
> +    sumLRF = h.getRequestFactory("standard", 0, 200, args);
> +
> +    // check phrase highlighting
> +    assertQ("Phrase highlighting - Lucene-794",
> +        sumLRF.makeRequest("t_text:\"text refers\""),
> +        "//lst[@name='highlighting']/lst[@name='1']",
> +        newHighlight1
> +        );
> +
> +    // non phrase queries should be highlighted as they were before  
> this fix
> +    assertQ("Phrase highlighting - Lucene-794",
> +        sumLRF.makeRequest("t_text:text refers"),
> +        "//lst[@name='highlighting']/lst[@name='1']",
> +        oldHighlight1, oldHighlight2, oldHighlight3
> +        );
> +  }
> }
>
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








Mime
View raw message