lucene-solr-user mailing list archives

From Chantal Ackermann <chantal.ackerm...@btelligent.de>
Subject Re: Autocomplete: match words anywhere in the token
Date Fri, 24 Sep 2010 10:05:43 GMT
Hi Jonathan,

Yes, without great effort it works only for single-valued fields. For
multivalued fields you'd have to do some extra work to get only the
values which contain tokens that start with the given prefix.
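
A minimal sketch of that extra work, done on the client after the facet
values come back. It assumes tokens are simply split on whitespace and
lowercased, which may not match what the field's analyzer really does;
the function name and the sample values are only for illustration.

```python
def values_matching_prefix(facet_values, prefix):
    """Keep only the values in which some token starts with the prefix.

    Assumption: tokens are whitespace-separated and compared in lower
    case. A real analyzer chain may split or normalize differently.
    """
    wanted = prefix.lower()
    return [
        value
        for value in facet_values
        if any(token.startswith(wanted) for token in value.lower().split())
    ]


# Values that a multivalued facet field might return for matching documents.
candidates = ["canon pixma mp500", "ink cartridge", "canon powershot sd500"]
print(values_matching_prefix(candidates, "pix"))
```

Only the values that actually contain a matching token are kept, so the
other values of the same documents do not show up as suggestions.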

But maybe you also mean whether it works for several fields in one query.
I guess not, but you can create a new field that contains the values of
the fields that you wish to query for autosuggestions (multivalued or
not, depending on whether you use faceting or the terms component).

I just checked and actually I have such a field, but I use it in
combination with the terms component, while I use the autosuggest based
on faceting in combination with a different single-valued (and
required) field. (I have two different autosuggest sources.)

However, the suggestions based on the terms component are always single
tokens (because of the way my fields are analyzed) - I haven't put any
effort into changing that because I'm not completely convinced that this
source of suggestions is good in my case here. There are far too many
tokens to suggest from and it all seems very arbitrary. The use case of
autosuggest I have in mind, though, is that of a really long dropdown
box (although of course all entries never show up at once) that offers
complex suggestions (phrases) that really denote some product or person
or other defined objects. And I achieved that with the other
autocomplete based on facets pretty well.
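
A rough sketch of how the parameters for such a facet-based request could
be put together. This is one possible reading of the approach quoted
below (search on the tokenized field, facet on the String field), not a
definitive recipe. The field names "name" and "name_str", the helper name,
and the limit are assumptions for illustration; the parameter names
themselves are the standard query and faceting parameters.

```python
from urllib.parse import urlencode


def suggest_params(prefix, search_field="name", facet_field="name_str", limit=10):
    """Build a query string that searches tokens but facets on whole values.

    Assumption: search_field is the tokenized field and facet_field is an
    untokenized String copy of it. Both names are hypothetical.
    """
    params = [
        ("q", f"{search_field}:{prefix}*"),  # prefix match on the tokens
        ("rows", "0"),                       # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", facet_field),        # complete values come back as facets
        ("facet.mincount", "1"),             # skip values with no matching document
        ("facet.limit", str(limit)),
    ]
    return urlencode(params)


print(suggest_params("pixma"))
```

Wildcard queries are usually not run through the analyzer, so the prefix
probably needs to be lowercased by the caller if the tokenized field
lowercases its tokens.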

I definitely need to have a look at how to use faceting in combination
with multivalued fields for autocomplete.

Cheers,
Chantal

On Thu, 2010-09-23 at 22:20 +0200, Jonathan Rochkind wrote:
> This works with _one_ entry per document, right?   If you've actually 
> found a clever trick to use this technique when you have more than one 
> entry for auto-suggest per document, do let me know.  Cause I haven't 
> been able to come up with one.
> 
> Jonathan
> 
> Chantal Ackermann wrote:
> > What works very well for me:
> >
> > 1.) Keep the tokenized field (KeywordTokenizerFactory,
> > WordDelimiterFilterFactory) (like you described you had)
> > 2.) create an additional field that uses the String type with the
> > same content (use copyField to fill either)
> > 3.) use facet.prefix instead of terms.prefix for searching the
> > suggestions
> > 4.) to your query add also the String field as a facet, and return the
> > results from that field as suggestion list. They will include the
> > complete String "canon pixma mp500" for example. The other field can
> > only return facets based on tokens. You probably never want that as
> > facets.
> >
> > So your query was alright and the "canon" (2) facet count probably is
> > the two occurrences that you listed, but as the field was tokenized,
> > only tokens would be returned as facets. You need to have an additional
> > field of pure String type to get the complete value as a facet back.
> >
> > In general, it worked out fine for me to create String fields as return
> > values for facets while using the tokenized fields for searching and the
> > actual facet queries.
> >
> > Cheers,
> > Chantal
> >
> >
> > On Wed, 2010-09-22 at 16:39 +0200, Jason Rutherglen wrote:
> >   
> >> This may be what you're looking for.
> >> http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >>
> >> On Wed, Sep 22, 2010 at 4:41 AM, Arunkumar Ayyavu
> >> <arunkumar.ayyavu@gmail.com> wrote:
> >>     
> >>> It's been over a week since I started learning Solr. Now, I'm using the
> >>> electronics store example to explore the autocomplete feature in Solr.
> >>>
> >>> When I send the query terms.fl=name&terms.prefix=canon to terms request
> >>> handler, I get the following response
> >>> <lst name="terms">
> >>>  <lst name="name">
> >>>   <int name="canon">2</int>
> >>>  </lst>
> >>> </lst>
> >>>
> >>> But I expect the following results in the response.
> >>> canon pixma mp500 all-in-one photo printer
> >>> canon powershot sd500
> >>>
> >>> So, I changed the schema for textgen fieldType to use
> >>> KeywordTokenizerFactory and also removed WordDelimiterFilterFactory. That
> >>> gives me the expected result.
> >>>
> >>> Now, I also want Solr to return "canon pixma mp500 all-in-one photo
> >>> printer" when I send the query terms.fl=name&terms.prefix=pixma. Could you
> >>> gurus help me get the expected result?
> >>>
> >>> BTW, I couldn't quite understand the behavior of terms.lower and terms.upper
> >>> (I tried these with the electronics store example). Could you also help me
> >>> understand these 2 query fields?
> >>> Thanks.
> >>>
> >>> --
> >>> Arun
> >>>
> >>>       
> >
> >
> >


