lucene-solr-user mailing list archives

From "R. Tan" <tanrihae...@gmail.com>
Subject Re: Scoring for specific field queries
Date Fri, 09 Oct 2009 07:55:36 GMT
Thanks, I'll give this a go. What are the two replacements for (the
special-character one and the 20-character one)? Also, do you get results
such as "XXXX formula"?

On Fri, Oct 9, 2009 at 3:45 PM, Avlesh Singh <avlesh@gmail.com> wrote:

> I have a very similar set-up for my auto-suggest (I am sorry that it can't
> be viewed from an external network).
> I am sending you my field definitions, please use them and see if it works
> out correctly.
>
> <fieldType name="autocomplete" class="solr.TextField">
>     <analyzer type="index">
>        <tokenizer class="solr.KeywordTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.PatternReplaceFilterFactory"
> pattern="([^a-z0-9])" replacement="" replace="all" />
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100"
> minGramSize="1" />
>     </analyzer>
>    <analyzer type="query">
>        <tokenizer class="solr.KeywordTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.PatternReplaceFilterFactory"
> pattern="([^a-z0-9])" replacement="" replace="all" />
>        <filter class="solr.PatternReplaceFilterFactory"
> pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
> </fieldType>
>
> <fieldType name="tokenized_autocomplete" class="solr.TextField">
>     <analyzer type="index">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100"
> minGramSize="1" />
>     </analyzer>
>    <analyzer type="query">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.LowerCaseFilterFactory" />
>         <filter class="solr.PatternReplaceFilterFactory"
> pattern="([^a-z0-9])" replacement="" replace="all" />
>        <filter class="solr.PatternReplaceFilterFactory"
> pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
> </fieldType>
>
> <field name="suggestion" type="autocomplete" indexed="true"
> stored="false"/>
> <field name="tokenized_suggestion" type="tokenized_autocomplete"
> indexed="true" stored="true"/>
>
> q=(suggestion:formula^2 tokenized_suggestion:formula)
>
> Hope this helps.
>
> Cheers
> Avlesh
>
> On Fri, Oct 9, 2009 at 1:03 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>
> > Yeah, I do get results. Anything else I missed out?
> > I want it to work like this site's auto suggest feature.
> >
> > http://www.sematext.com/demo/ac/index.html
> >
> > Try the keyword 'formula'.
> >
> > Thanks,
> > Rih
> >
> >
> > On Fri, Oct 9, 2009 at 3:24 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> >
> > > Can you just do q=autoCompleteHelper2:caf to see if you get results?
> > >
> > > Cheers
> > > Avlesh
> > >
> > > On Fri, Oct 9, 2009 at 12:53 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > >
> > > > Yup, it is. Both are copied from another field called name.
> > > >
> > > > On Fri, Oct 9, 2009 at 3:15 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > >
> > > > > Lame question, but are you populating data in the autoCompleteHelper2
> > > > > field?
> > > > >
> > > > > Cheers
> > > > > Avlesh
> > > > >
> > > > > On Fri, Oct 9, 2009 at 12:36 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > >
> > > > > > The problem is, I'm getting equal scores for this:
> > > > > > Query:
> > > > > > q=(autoCompleteHelper2:caf^10.0 autoCompleteHelper:caf)
> > > > > >
> > > > > > Partial Result:
> > > > > >
> > > > > > <doc>
> > > > > > <float name="score">0.7821733</float>
> > > > > > <str name="autoCompleteHelper">Bikes Café</str>
> > > > > > </doc>
> > > > > >
> > > > > > <doc>
> > > > > > <float name="score">0.7821733</float>
> > > > > > <str name="autoCompleteHelper">Cafe Feliy</str>
> > > > > > </doc>
> > > > > >
> > > > > > I'm using the standard request handler with this.
> > > > > >
> > > > > > Thanks,
> > > > > > Rih
> > > > > >
> > > > > >
> > > > > > On Fri, Oct 9, 2009 at 3:02 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > >
> > > > > > > Avlesh,
> > > > > > > I don't see anything wrong with the data from analysis.
> > > > > > >
> > > > > > > KeywordTokenized:
> > > > > > >
> > > > > > > position  term text  term type  source start,end
> > > > > > > 1         "th"       word       0,2
> > > > > > > 2         "he"       word       1,3
> > > > > > > 3         "e "       word       2,4
> > > > > > > 4         " c"       word       3,5
> > > > > > > 5         "ch"       word       4,6
> > > > > > > 6         "ha"       word       5,7
> > > > > > > 7         "am"       word       6,8
> > > > > > > 8         "mp"       word       7,9
> > > > > > > 9         "pi"       word       8,10
> > > > > > > 10        "io"       word       9,11
> > > > > > > 11        "on"       word       10,12
> > > > > > > 12        "the"      word       0,3
> > > > > > > 13        "he "      word       1,4
> > > > > > > 14        "e c"      word       2,5
> > > > > > > 15        " ch"      word       3,6
> > > > > > > 16        "cha"      word       ...
> > > > > > > ...
> > > > > > >
> > > > > > > WhitespaceTokenized:
> > > > > > >
> > > > > > > position  term text  term type  source start,end
> > > > > > > 1         "th"       word       0,2
> > > > > > > 2         "he"       word       1,3
> > > > > > > 3         "the"      word       0,3
> > > > > > > 4         "ch"       word       0,2
> > > > > > > 5         "ha"       word       1,3
> > > > > > > 6         "am"       word       2,4
> > > > > > > 7         "mp"       word       3,5
> > > > > > > 8         "pi"       word       4,6
> > > > > > > 9         "io"       word       5,7
> > > > > > > 10        "on"       word       6,8
> > > > > > > 11        "cha"      word       ...
> > > > > > > ...
> > > > > > >
> > > > > > > Is term position considered during scoring?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Rih
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Oct 9, 2009 at 9:40 AM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > > > > >
> > > > > > > >> Use the field analysis tool to see how the data is being analyzed in
> > > > > > > >> both the fields.
> > > > > > >>
> > > > > > >> Cheers
> > > > > > >> Avlesh
> > > > > > >>
> > > > > > > >> On Fri, Oct 9, 2009 at 12:56 AM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > >>
> > > > > > > >> > Hmm... I don't quite get the desired results. Those starting with
> > > > > > > >> > "cha" are now randomly ordered. Is there something wrong with the
> > > > > > > >> > filters I applied?
> > > > > > >> >
> > > > > > >> >
> > > > > > > >> > On Thu, Oct 8, 2009 at 7:38 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > > > > >> >
> > > > > > > >> > > Filters? I did not mean filters at all.
> > > > > > > >> > > I am in a mad rush right now, but on the face of it your field
> > > > > > > >> > > definitions look right.
> > > > > > >> > >
> > > > > > >> > > This is what I asked for -
> > > > > > >> > > q=(autoComplete2:cha^10 autoComplete:cha)
> > > > > > >> > >
> > > > > > >> > > Lemme know if this does not work for you.
> > > > > > >> > >
> > > > > > >> > > Cheers
> > > > > > >> > > Avlesh
> > > > > > >> > >
> > > > > > > >> > > On Thu, Oct 8, 2009 at 4:58 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi Avlesh,
> > > > > > >> > > >
> > > > > > >> > > > I can't seem to get the scores right.
> > > > > > >> > > >
> > > > > > > >> > > > I now have these types for the fields I'm targeting,
> > > > > > >> > > >
> > > > > > > >> > > > <fieldType name="autoComplete" class="solr.TextField"
> > > > > > > >> > > >            positionIncrementGap="1">
> > > > > > > >> > > >   <analyzer type="index">
> > > > > > > >> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > > >> > > >     <filter class="solr.NGramFilterFactory" minGramSize="1"
> > > > > > > >> > > >             maxGramSize="20"/>
> > > > > > > >> > > >   </analyzer>
> > > > > > > >> > > >   <analyzer type="query">
> > > > > > > >> > > >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > > >> > > >   </analyzer>
> > > > > > > >> > > > </fieldType>
> > > > > > > >> > > > <fieldType name="autoComplete2" class="solr.TextField"
> > > > > > > >> > > >            positionIncrementGap="1">
> > > > > > > >> > > >   <analyzer type="index">
> > > > > > > >> > > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > > >> > > >     <filter class="solr.NGramFilterFactory" minGramSize="1"
> > > > > > > >> > > >             maxGramSize="20"/>
> > > > > > > >> > > >   </analyzer>
> > > > > > > >> > > >   <analyzer type="query">
> > > > > > > >> > > >     <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > > > > >> > > >     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > > >> > > >   </analyzer>
> > > > > > > >> > > > </fieldType>
> > > > > > >> > > >
> > > > > > >> > > > My query is this,
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> q=*:*&fq=autoCompleteHelper:cha+autoCompleteHelper2:cha&qf=autoCompleteHelper^10.0+autoCompleteHelper2^1.0
> > > > > > >> > > >
> > > > > > > >> > > > What should I tweak from the above config and query?
> > > > > > >> > > >
> > > > > > >> > > > Thanks,
> > > > > > >> > > > Rih
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > > >> > > > On Thu, Oct 8, 2009 at 4:38 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > >> > > >
> > > > > > > >> > > > > I will have to pass on this and try your suggestion first. So, how
> > > > > > > >> > > > > does your suggestion (1 and 2) boost my startswith query? Is it
> > > > > > > >> > > > > because of the n-gram filter?
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > > >> > > > > On Thu, Oct 8, 2009 at 2:27 PM, Sandeep Tagore <sandeep.tagore@gmail.com> wrote:
> > > > > > >> > > > >
> > > > > > >> > > > >>
> > > > > > > >> > > > >> Yes it can be done but it needs some customization. Search for
> > > > > > > >> > > > >> custom sort implementations/discussions.
> > > > > > > >> > > > >> You can check...
> > > > > > > >> > > > >> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
> > > > > > > >> > > > >> .
> > > > > > > >> > > > >> Let us know if you have any issues.
> > > > > > >> > > > >>
> > > > > > >> > > > >> Sandeep
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >> R. Tan wrote:
> > > > > > >> > > > >> >
> > > > > > > >> > > > >> > This might work and I also have a single value field which makes
> > > > > > > >> > > > >> > it cleaner.
> > > > > > > >> > > > >> > Can sort be customized (with indexOf()) from the solr parameters
> > > > > > > >> > > > >> > alone?
> > > > > > >> > > > >> >
> > > > > > >> > > > >>
> > > > > > >> > > > >> --
> > > > > > >> > > > >> View this message in context:
> > > > > > >> > > > >>
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
> http://www.nabble.com/Scoring-for-specific-field-queries-tp25798390p25799055.html
> > > > > > >> > > > >> Sent from the Solr - User mailing
list archive at
> > > > Nabble.com.
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
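
For anyone following the thread, here is a rough Python sketch of what the two field types in Avlesh's reply appear to do to a value at index time and at query time. It is only a simulation of the filter chains as written in the quoted XML, not Solr code: the function names are made up for illustration, RemoveDuplicatesTokenFilterFactory is left out, and Lucene scoring is not modeled at all. The sample names are plain-ASCII variants of the ones in the thread.

```python
import re

# Same character class as pattern="([^a-z0-9])" in the quoted config.
NON_ALNUM = re.compile(r"[^a-z0-9]")


def normalize(token):
    """Drop everything except lowercase ASCII letters and digits."""
    return NON_ALNUM.sub("", token)


def truncate(token, limit=20):
    """Equivalent of pattern="^(.{20})(.*)?" replacement="$1": keep the first 20 chars."""
    return token[:limit]


def edge_ngrams(token, min_size=1, max_size=100):
    """Prefixes of the token, mimicking an edge n-gram filter (minGramSize=1, maxGramSize=100)."""
    return [token[:n] for n in range(min_size, min(len(token), max_size) + 1)]


def suggestion_index_terms(text):
    """'autocomplete' type, index side: whole value is one token, then lowercase, strip, prefixes."""
    return edge_ngrams(normalize(text.lower()))


def tokenized_index_terms(text):
    """'tokenized_autocomplete' type, index side: split on whitespace, lowercase, prefixes per word."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token))
    return terms


def suggestion_query_term(text):
    """'autocomplete' type, query side: one token, lowercased, stripped, cut to 20 chars."""
    return truncate(normalize(text.lower()))


def tokenized_query_terms(text):
    """'tokenized_autocomplete' type, query side: per-word lowercase, strip, cut to 20 chars."""
    return [truncate(normalize(token)) for token in text.lower().split()]


# Which clause of q=(suggestion:caf^2 tokenized_suggestion:caf) would each name match?
for name in ("Cafe Feliy", "Bikes Cafe"):
    term = suggestion_query_term("caf")
    print(name,
          term in suggestion_index_terms(name),   # boosted whole-value prefix clause
          term in tokenized_index_terms(name))    # per-word prefix clause
```

Under these assumptions, "Cafe Feliy" matches both clauses while "Bikes Cafe" matches only the per-word one, which is presumably why the boosted `suggestion` clause should push starts-with matches to the top. The first replacement removes spaces and punctuation so that the typed text lines up with the stripped index terms; the second only shortens query input that is longer than 20 characters.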
