lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Remove duplicate suggestions in Solr
Date Fri, 21 Aug 2015 05:41:09 GMT
Hi,

I would like to check whether there is any way to remove duplicate suggestions in
Solr.
I have several documents that look very similar, and when I run a
suggestion query, every result comes back with the same text. I'm using Solr 5.2.1.

This is my suggestion pipeline:

<requestHandler name="/suggest" class="solr.SearchHandler">
<lst name="defaults">
<!-- Browse specific stuff -->
<str name="echoParams">all</str>
  <str name="wt">json</str>
  <str name="indent">true</str>

<!-- Everything below should be identical to "ac" handler above -->
<str name="defType">edismax</str>
<str name="rows">10</str>
<str name="fl">id, score</str>
<!--<str name="qf">textsuggest^30 extrasearch^30.0 textng^50.0
phonetic^10</str>-->
<!--<str name="qf">content^50 title^50 extrasearch^30.0 textng^1.0
textng2^200.0</str>-->
<str name="qf">content^50 title^50 extrasearch^30.0</str>
<str name="pf">textnge^50.0</str>
<!--<str name="bf">product(log(sum(popularity,1)),100)^20</str>-->
<!-- Define relative importance between types. May be overridden per
request by e.g. &personboost=120 -->
<str
name="boost">product(map(query($type1query),0,0,1,$type1boost),map(query($type2query),0,0,1,$type2boost),map(query($type3query),0,0,1,$type3boost),map(query($type4query),0,0,1,$type4boost),$typeboost)</str>
<double name="typeboost">1.0</double>

<str name="type1query">content_type:"application/pdf"</str>
<double name="type1boost">0.9</double>
<str name="type2query">content_type:"application/msword"</str>
<double name="type2boost">0.5</double>
<str name="type3query">content_type:"NA"</str>
<double name="type3boost">0.0</double>
<str name="type4query">content_type:"NA"</str>
<double name="type4boost">0.0</double>
  <str name="hl">on</str>
  <str name="hl.fl">id, textng, textng2, language_s</str>
  <str name="hl.highlightMultiTerm">true</str>
  <str name="hl.preserveMulti">true</str>
  <str name="hl.encoder">html</str>
  <!--<str name="f.content.hl.fragsize">80</str>-->
  <str name="hl.fragsize">50</str>
<str name="debugQuery">false</str>
</lst>
</requestHandler>

This is my query:
http://localhost:8983/edm/chinese2/suggest?q=do our best&defType=edismax&qf=content^5 textng^5&pf=textnge^50&pf2=content^20 textnge^50&pf3=content^40%20textnge^50&ps2=2&ps3=2&stats.calcdistinct=true
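For reference, the same request can be built with its parameters URL-encoded, since the raw URL above contains spaces and `^` characters. This is only a minimal sketch in Python: the host, core path, and parameter values are copied from the query above, while the helper name `build_suggest_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Base URL copied from the query above; adjust to your own host and core.
BASE_URL = "http://localhost:8983/edm/chinese2/suggest"


def build_suggest_url(query: str) -> str:
    """Return the suggest URL with every parameter percent-encoded.

    build_suggest_url is a hypothetical helper, not part of Solr.
    """
    params = [
        ("q", query),
        ("defType", "edismax"),
        ("qf", "content^5 textng^5"),
        ("pf", "textnge^50"),
        ("pf2", "content^20 textnge^50"),
        ("pf3", "content^40 textnge^50"),
        ("ps2", "2"),
        ("ps3", "2"),
        ("stats.calcdistinct", "true"),
    ]
    # urlencode turns spaces into "+" and "^" into "%5E".
    return BASE_URL + "?" + urlencode(params)


url = build_suggest_url("do our best")
print(url)
```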


This is the suggestion result:

 "highlighting":{
    "responsibility001":{
      "id":["responsibility001"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility002":{
      "id":["responsibility002"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility003":{
      "id":["responsibility003"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility004":{
      "id":["responsibility004"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility005":{
      "id":["responsibility005"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility006":{
      "id":["responsibility006"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility007":{
      "id":["responsibility007"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility008":{
      "id":["responsibility008"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility009":{
      "id":["responsibility009"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]},
    "responsibility010":{
      "id":["responsibility010"],
      "textng":["We will strive to <em>do</em> <em>our</em> <em>best</em>. &lt;br&gt; "]}}


Regards,
Edwin
