lucene-solr-user mailing list archives

From Brian Lamb <brian.l...@journalexperts.com>
Subject Re: Edgengram
Date Tue, 31 May 2011 16:07:27 GMT
<fieldType name="edgengram" class="solr.TextField" positionIncrementGap="1000">
   <analyzer>
     <tokenizer class="solr.LowerCaseTokenizerFactory" />
     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front" />
   </analyzer>
</fieldType>

I believe I used that link when I initially set up the field, and it worked
great (I'm still using it in other places). In this particular example,
however, it does not appear to be practical for me. As I mentioned, I have a
similarity class that returns 1 for the idf, and in the case of an edgengram
field, the idf comes out as 1 * the length of the search string.
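
To make that arithmetic concrete, here is a minimal sketch in plain Python
(not Solr or Lucene code). The helper names edge_ngrams and summed_idf are
hypothetical, the tokenizer is reduced to simple lowercasing, and the scoring
is reduced to "add a constant idf for every query term found in the index",
which is only an approximation of what the explain output quoted below shows.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front-edge n-grams of a token, mimicking side="front" (assumption:
    the input contains letters only, so tokenizing is just lowercasing)."""
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def summed_idf(query_terms, indexed_terms, idf_per_term=1.0):
    """Add a constant idf for each query term that exists in the index."""
    return sum(idf_per_term for term in query_terms if term in indexed_terms)


# Terms the index would hold for the field value from the original question.
indexed = set(edge_ngrams("abcdefghijklmnop"))

# Query run through the same edge n-gram chain: a, ab, ..., abcdefg.
ngrammed_query = edge_ngrams("abcdefg")

# Query kept as a single term, roughly what a keyword-style query
# analyzer would produce.
single_term_query = ["abcdefg"]

print(summed_idf(ngrammed_query, indexed))     # 7.0
print(summed_idf(single_term_query, indexed))  # 1.0
```

Under those assumptions, the 7 comes from the query itself being split into
seven grams. If the query were not n-grammed, only one term would contribute.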

Thanks,

Brian Lamb

On Tue, May 31, 2011 at 11:34 AM, bmdakshinamurthy@gmail.com <
bmdakshinamurthy@gmail.com> wrote:

> Can you specify the analyzer you are using for your queries?
>
> Maybe you could use a KeywordAnalyzer for your queries so you don't end up
> matching parts of your query.
>
> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> This should help you.
>
> On Tue, May 31, 2011 at 8:24 PM, Brian Lamb
> <brian.lamb@journalexperts.com> wrote:
>
> > In this particular case, I will be doing a Solr search based on user
> > preferences. So I will not be depending on the user to type "abcdefg".
> > That will be automatically generated based on user selections.
> >
> > The contents of the field do not contain spaces, and since I am creating
> > the search parameters, case isn't important either.
> >
> > Thanks,
> >
> > Brian Lamb
> >
> > On Tue, May 31, 2011 at 9:44 AM, Erick Erickson
> > <erickerickson@gmail.com> wrote:
> >
> > > That'll work for your case, although be aware that string types aren't
> > > analyzed at all, so case matters, as do spaces, etc.
> > >
> > > What is the use-case here? If you explain it a bit there might be
> > > better answers....
> > >
> > > Best
> > > Erick
> > >
> > > On Fri, May 27, 2011 at 9:17 AM, Brian Lamb
> > > <brian.lamb@journalexperts.com> wrote:
> > > > For this, I ended up just changing it to string and using "abcdefg*"
> > > > to match. That seems to work so far.
> > > >
> > > > Thanks,
> > > >
> > > > Brian Lamb
> > > >
> > > > On Wed, May 25, 2011 at 4:53 PM, Brian Lamb
> > > > <brian.lamb@journalexperts.com> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I'm running into some confusion with the way edgengram works. I have
> > > >> the field set up as:
> > > >>
> > > >> <fieldType name="edgengram" class="solr.TextField" positionIncrementGap="1000">
> > > >>    <analyzer>
> > > >>      <tokenizer class="solr.LowerCaseTokenizerFactory" />
> > > >>      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="front" />
> > > >>    </analyzer>
> > > >> </fieldType>
> > > >>
> > > >> I've also set up my own similarity class that returns 1 as the idf
> > > >> score. What I've found this does is if I match a string "abcdefg"
> > > >> against a field containing "abcdefghijklmnop", then the idf will
> > > >> score that as a 7:
> > > >>
> > > >> 7.0 = idf(myfield: a=51 ab=23 abc=2 abcd=2 abcde=2 abcdef=2 abcdefg=2)
> > > >>
> > > >> I get why that's happening, but is there a way to avoid that? Do I
> > > >> need to do a new field type to achieve the desired effect?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Brian Lamb
> > > >>
> > > >
> > >
> >
>
>
>
> --
> Thanks and Regards,
> DakshinaMurthy BM
>
