lucene-solr-user mailing list archives

From "R. Tan" <tanrihae...@gmail.com>
Subject Re: Scoring for specific field queries
Date Fri, 09 Oct 2009 08:26:24 GMT
I ended up with the same set of results earlier, but I don't get results such
as "the champion". I think that is because of the EdgeNGram filter.

With NGram, I'm back to the same problem:

Result for q=ca

<doc>
<float name="score">0.8717008</float>
<str name="tokenized_suggestion">Blu Jazz Cafe</str>
</doc>

<doc>
<float name="score">0.8717008</float>
<str name="tokenized_suggestion">Café in the Pond</str>
</doc>
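
To make the difference between the two filters concrete, here is a minimal sketch in plain Python. It is not Solr's implementation; the function names and the `keyword` flag are made up for illustration, and it only models lowercasing, tokenizing and gram expansion. It shows which index terms a query such as "cha" or "ca" can hit.

```python
def edge_ngrams(token, min_size=1, max_size=100):
    """Front-edge grams: prefixes of token with lengths min_size..max_size."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def ngrams(token, min_size=1, max_size=20):
    """Full grams: every substring of token with lengths min_size..max_size."""
    grams = []
    upper = min(max_size, len(token))
    for n in range(min_size, upper + 1):
        for start in range(len(token) - n + 1):
            grams.append(token[start:start + n])
    return grams


def index_terms(text, gram_fn, keyword=False):
    """Lowercase, tokenize (whole string if keyword, else on whitespace),
    then expand every token into grams and collect them as a set."""
    lowered = text.lower()
    tokens = [lowered] if keyword else lowered.split()
    terms = set()
    for tok in tokens:
        terms.update(gram_fn(tok))
    return terms


# Keyword tokenizer + edge grams: only prefixes of the whole string are indexed.
print("cha" in index_terms("the champion", edge_ngrams, keyword=True))
# Whitespace tokenizer + edge grams: prefixes of each word are indexed.
print("cha" in index_terms("the champion", edge_ngrams))
# Full n-grams: "ca" is an index term for both documents in the result above.
print("ca" in index_terms("Blu Jazz Cafe", ngrams))
print("ca" in index_terms("Café in the Pond", ngrams))
```

Under these assumptions, a keyword-tokenized edge-gram field cannot match "cha" inside "the champion", while a full n-gram field matches "ca" in both documents without recording whether the gram came from the start of a word.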


On Fri, Oct 9, 2009 at 4:02 PM, R. Tan <tanrihaed58@gmail.com> wrote:

> How do these filters help the autosuggest?
> <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{20})(.*)?"
> replacement="$1" replace="all" />
> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
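
As a rough illustration of what the two PatternReplaceFilterFactory patterns quoted in this thread do to a token, the sketch below applies the same regular expressions with Python's `re` module. This is an assumption-laden stand-in, not Solr code: the helper names are hypothetical, and Java's `$1` replacement is written as `\1` in Python.

```python
import re

# pattern="([^a-z0-9])" replacement="" replace="all"
NON_ALNUM = re.compile(r"([^a-z0-9])")
# pattern="^(.{20})(.*)?" replacement="$1" replace="all"
FIRST_20 = re.compile(r"^(.{20})(.*)?")


def strip_non_alnum(token):
    """Lowercase the token, then drop every character outside a-z and 0-9."""
    return NON_ALNUM.sub("", token.lower())


def truncate_20(token):
    """Keep only the first 20 characters; shorter tokens pass through unchanged."""
    return FIRST_20.sub(r"\1", token)


print(strip_non_alnum("the champion"))
print(strip_non_alnum("Blu Jazz Cafe"))
print(truncate_20("a" * 25))
print(truncate_20("formula"))
```

With a keyword tokenizer the first pattern also removes the spaces between words, and it drops accented letters such as "é" because they fall outside a-z. The second pattern caps the query token at 20 characters.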
>
>
>
> On Fri, Oct 9, 2009 at 3:59 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>
>> > What are the replacements for, the special character and 20 char?
>>
>> I had no time to diff between your definitions and mine. Copy-pasting mine
>> was easier :)
>>
>> > Also, do you get results such as "XXXX formula"?
>>
>> The "autocomplete" field would definitely not match this query, but the
>> "tokenized autocomplete" would.
>> Give it a shot, it should work as you expect it to.
>>
>> Cheers
>> Avlesh
>>
>> On Fri, Oct 9, 2009 at 1:25 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>>
>> > Thanks, I'll give this a go. What are the replacements for, the special
>> > character and 20 char? Also, do you get results such as "XXXX formula"?
>> >
>> > On Fri, Oct 9, 2009 at 3:45 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>> >
>> > > I have a very similar set-up for my auto-suggest (I am sorry that it
>> > > can't be viewed from an external network).
>> > > I am sending you my field definitions, please use them and see if it
>> > > works out correctly.
>> > >
>> > > <fieldType name="autocomplete" class="solr.TextField">
>> > >   <analyzer type="index">
>> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
>> > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > >     <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
>> > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > >     <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1"/>
>> > >   </analyzer>
>> > >   <analyzer type="query">
>> > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
>> > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > >     <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
>> > >     <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{20})(.*)?" replacement="$1" replace="all"/>
>> > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > >   </analyzer>
>> > > </fieldType>
>> > >
>> > > <fieldType name="tokenized_autocomplete" class="solr.TextField">
>> > >   <analyzer type="index">
>> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > >     <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100" minGramSize="1"/>
>> > >   </analyzer>
>> > >   <analyzer type="query">
>> > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > >     <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
>> > >     <filter class="solr.PatternReplaceFilterFactory" pattern="^(.{20})(.*)?" replacement="$1" replace="all"/>
>> > >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>> > >   </analyzer>
>> > > </fieldType>
>> > >
>> > > <field name="suggestion" type="autocomplete" indexed="true" stored="false"/>
>> > > <field name="tokenized_suggestion" type="tokenized_autocomplete" indexed="true" stored="true"/>
>> > >
>> > > q=(suggestion:formula^2 tokenized_suggestion:formula)
>> > >
>> > > Hope this helps.
>> > >
>> > > Cheers
>> > > Avlesh
>> > >
>> > > On Fri, Oct 9, 2009 at 1:03 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > >
>> > > > Yeah, I do get results. Anything else I missed out?
>> > > > I want it to work like this site's auto suggest feature.
>> > > >
>> > > > http://www.sematext.com/demo/ac/index.html
>> > > >
>> > > > Try the keyword 'formula'.
>> > > >
>> > > > Thanks,
>> > > > Rih
>> > > >
>> > > >
>> > > > On Fri, Oct 9, 2009 at 3:24 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>> > > >
>> > > > > Can you just do q=autoCompleteHelper2:caf to see if you get results?
>> > > > >
>> > > > > Cheers
>> > > > > Avlesh
>> > > > >
>> > > > > On Fri, Oct 9, 2009 at 12:53 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > >
>> > > > > > Yup, it is. Both are copied from another field called name.
>> > > > > >
>> > > > > > On Fri, Oct 9, 2009 at 3:15 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>> > > > > >
>> > > > > > > Lame question, but are you populating data in the autoCompleteHelper2
>> > > > > > > field?
>> > > > > > >
>> > > > > > > Cheers
>> > > > > > > Avlesh
>> > > > > > >
>> > > > > > > On Fri, Oct 9, 2009 at 12:36 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > The problem is, I'm getting equal scores for this:
>> > > > > > > > Query:
>> > > > > > > > q=(autoCompleteHelper2:caf^10.0 autoCompleteHelper:caf)
>> > > > > > > >
>> > > > > > > > Partial Result:
>> > > > > > > >
>> > > > > > > > <doc>
>> > > > > > > > <float name="score">0.7821733</float>
>> > > > > > > > <str name="autoCompleteHelper">Bikes Café</str>
>> > > > > > > > </doc>
>> > > > > > > >
>> > > > > > > > <doc>
>> > > > > > > > <float name="score">0.7821733</float>
>> > > > > > > > <str name="autoCompleteHelper">Cafe Feliy</str>
>> > > > > > > > </doc>
>> > > > > > > >
>> > > > > > > > I'm using the standard request handler with this.
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > > Rih
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, Oct 9, 2009 at 3:02 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > Avlesh,
>> > > > > > > > > I don't see anything wrong with the data from analysis.
>> > > > > > > > >
>> > > > > > > > > KeywordTokenized:
>> > > > > > > > >
>> > > > > > > > > term position  term text  term type  source start,end
>> > > > > > > > > 1              "th"       word       0,2
>> > > > > > > > > 2              "he"       word       1,3
>> > > > > > > > > 3              "e "       word       2,4
>> > > > > > > > > 4              " c"       word       3,5
>> > > > > > > > > 5              "ch"       word       4,6
>> > > > > > > > > 6              "ha"       word       5,7
>> > > > > > > > > 7              "am"       word       6,8
>> > > > > > > > > 8              "mp"       word       7,9
>> > > > > > > > > 9              "pi"       word       8,10
>> > > > > > > > > 10             "io"       word       9,11
>> > > > > > > > > 11             "on"       word       10,12
>> > > > > > > > > 12             "the"      word       0,3
>> > > > > > > > > 13             "he "      word       1,4
>> > > > > > > > > 14             "e c"      word       2,5
>> > > > > > > > > 15             " ch"      word       3,6
>> > > > > > > > > 16             "cha"      word       ...
>> > > > > > > > > ...
>> > > > > > > > >
>> > > > > > > > > WhitespaceTokenized:
>> > > > > > > > >
>> > > > > > > > > term position  term text  term type  source start,end
>> > > > > > > > > 1              "th"       word       0,2
>> > > > > > > > > 2              "he"       word       1,3
>> > > > > > > > > 3              "the"      word       0,3
>> > > > > > > > > 4              "ch"       word       0,2
>> > > > > > > > > 5              "ha"       word       1,3
>> > > > > > > > > 6              "am"       word       2,4
>> > > > > > > > > 7              "mp"       word       3,5
>> > > > > > > > > 8              "pi"       word       4,6
>> > > > > > > > > 9              "io"       word       5,7
>> > > > > > > > > 10             "on"       word       6,8
>> > > > > > > > > 11             "cha"      word       ...
>> > > > > > > > > ...
>> > > > > > > > >
>> > > > > > > > > Is term position considered during scoring?
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > Rih
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > On Fri, Oct 9, 2009 at 9:40 AM, Avlesh Singh <avlesh@gmail.com> wrote:
>> > > > > > > > >
>> > > > > > > > >> Use the field analysis tool to see how the data is being analyzed in both
>> > > > > > > > >> the fields.
>> > > > > > > > >>
>> > > > > > > > >> Cheers
>> > > > > > > > >> Avlesh
>> > > > > > > > >>
>> > > > > > > > >> On Fri, Oct 9, 2009 at 12:56 AM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > > > > > >>
>> > > > > > > > >> > Hmm... I don't quite get the desired results. Those starting with "cha"
>> > > > > > > > >> > are now randomly ordered. Is there something wrong with the filters I
>> > > > > > > > >> > applied?
>> > > > > > > > >> >
>> > > > > > > > >> >
>> > > > > > > > >> > On Thu, Oct 8, 2009 at 7:38 PM, Avlesh Singh <avlesh@gmail.com> wrote:
>> > > > > > > > >> >
>> > > > > > > > >> > > Filters? I did not mean filters at all.
>> > > > > > > > >> > > I am in a mad rush right now, but on the face of it your field
>> > > > > > > > >> > > definitions look right.
>> > > > > > > > >> > >
>> > > > > > > > >> > > This is what I asked for -
>> > > > > > > > >> > > q=(autoComplete2:cha^10 autoComplete:cha)
>> > > > > > > > >> > >
>> > > > > > > > >> > > Lemme know if this does not work for you.
>> > > > > > > > >> > >
>> > > > > > > > >> > > Cheers
>> > > > > > > > >> > > Avlesh
>> > > > > > > > >> > >
>> > > > > > > > >> > > On Thu, Oct 8, 2009 at 4:58 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > > > > > >> > >
>> > > > > > > > >> > > > Hi Avlesh,
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > I can't seem to get the scores right.
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > I now have these types for the fields I'm targeting,
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > <fieldType name="autoComplete" class="solr.TextField" positionIncrementGap="1">
>> > > > > > > > >> > > >   <analyzer type="index">
>> > > > > > > > >> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
>> > > > > > > > >> > > >   </analyzer>
>> > > > > > > > >> > > >   <analyzer type="query">
>> > > > > > > > >> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > > > > > > >> > > >   </analyzer>
>> > > > > > > > >> > > > </fieldType>
>> > > > > > > > >> > > > <fieldType name="autoComplete2" class="solr.TextField" positionIncrementGap="1">
>> > > > > > > > >> > > >   <analyzer type="index">
>> > > > > > > > >> > > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
>> > > > > > > > >> > > >   </analyzer>
>> > > > > > > > >> > > >   <analyzer type="query">
>> > > > > > > > >> > > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
>> > > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
>> > > > > > > > >> > > >   </analyzer>
>> > > > > > > > >> > > > </fieldType>
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > My query is this,
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > q=*:*&fq=autoCompleteHelper:cha+autoCompleteHelper2:cha&qf=autoCompleteHelper^10.0+autoCompleteHelper2^1.0
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > What should I tweak from the above config and query?
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > Thanks,
>> > > > > > > > >> > > > Rih
>> > > > > > > > >> > > >
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > On Thu, Oct 8, 2009 at 4:38 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>> > > > > > > > >> > > >
>> > > > > > > > >> > > > > I will have to pass on this and try your suggestion first. So, how does
>> > > > > > > > >> > > > > your suggestion (1 and 2) boost my startswith query? Is it because of
>> > > > > > > > >> > > > > the n-gram filter?
>> > > > > > > > >> > > > >
>> > > > > > > > >> > > > >
>> > > > > > > > >> > > > >
>> > > > > > > > >> > > > > On Thu, Oct 8, 2009 at 2:27 PM, Sandeep Tagore <sandeep.tagore@gmail.com> wrote:
>> > > > > > > > >> > > > >
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >> Yes it can be done but it needs some customization. Search for custom sort
>> > > > > > > > >> > > > >> implementations/discussions.
>> > > > > > > > >> > > > >> You can check...
>> > > > > > > > >> > > > >> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
>> > > > > > > > >> > > > >> .
>> > > > > > > > >> > > > >> Let us know if you have any issues.
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >> Sandeep
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >> R. Tan wrote:
>> > > > > > > > >> > > > >> >
>> > > > > > > > >> > > > >> > This might work and I also have a single value field which makes it
>> > > > > > > > >> > > > >> > cleaner.
>> > > > > > > > >> > > > >> > Can sort be customized (with indexOf()) from the solr parameters alone?
>> > > > > > > > >> > > > >> >
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >> --
>> > > > > > > > >> > > > >> View this message in context:
>> > > > > > > > >> > > > >> http://www.nabble.com/Scoring-for-specific-field-queries-tp25798390p25799055.html
>> > > > > > > > >> > > > >> Sent from the Solr - User mailing list archive at Nabble.com.
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >>
>> > > > > > > > >> > > > >
>> > > > > > > > >> > > >
>> > > > > > > > >> > >
>> > > > > > > > >> >
>> > > > > > > > >>
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
