lucene-solr-user mailing list archives

From "Audrey Lorberfeld - Audrey.Lorberfeld@ibm.com" <Audrey.Lorberf...@ibm.com>
Subject Re: Re: Anyone have experience with Query Auto-Suggestor?
Date Sun, 26 Jan 2020 14:03:25 GMT
Oh, great! Thank you, this is helpful!

On 1/24/20, 6:43 PM, "Walter Underwood" <wunder@wunderwood.org> wrote:

    Click-based weights are vulnerable to spamming. Some of us fondly remember when
    Google was showing Microsoft as the first hit for “evil empire” thanks to a click attack.
    
    For our ecommerce search, we use the actual titles of books weighted by order volume.
    Decorated titles are reduced to a base title, so “Managerial Accounting: Student Value Edition”
    becomes just “Managerial Accounting”. Showing all the variations is the job of the
    real results page.
    
    wunder
    Walter Underwood
    wunder@wunderwood.org
    https://urldefense.proofpoint.com/v2/url?u=http-3A__observer.wunderwood.org_&d=DwIFaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M&m=3oEhRJWEHDoz3HXt87Y_FXxPTUZg1zSA5r4P6urviug&s=87IOY_vKNONtR2r2IkW-NnZ4Rn3wI-OIO6RSdqdOMfU&e=
  (my blog)
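
[Editor's note] The title reduction and order-volume weighting described in the message above can be sketched roughly as follows. This is only an illustration, not Walter's actual pipeline: the decoration rules (drop parentheticals, drop everything after the first colon), the function names, and the input shape of (title, units ordered) pairs are all assumptions made for the example.

```python
import re
from collections import defaultdict


def base_title(title: str) -> str:
    """Reduce a decorated title to a base title.

    Assumed rules (hypothetical, a real catalog needs its own):
    drop parentheticals, then drop everything after the first colon.
    """
    title = re.sub(r"\s*\([^)]*\)", "", title)  # e.g. "(16th Edition)"
    title = title.split(":", 1)[0]              # e.g. ": Student Value Edition"
    return " ".join(title.split())              # collapse stray whitespace


def suggestion_weights(orders):
    """Sum order volume per base title.

    orders: iterable of (title, units_ordered) pairs (assumed input shape).
    Returns a dict mapping base title -> total units ordered.
    """
    weights = defaultdict(int)
    for title, units in orders:
        weights[base_title(title)] += units
    return dict(weights)


if __name__ == "__main__":
    sample = [
        ("Managerial Accounting: Student Value Edition", 40),
        ("Managerial Accounting (16th Edition)", 60),
        ("Calculus", 5),
    ]
    for title, weight in suggestion_weights(sample).items():
        print(title, weight)
```

All variants of a title collapse into one suggestion whose weight is the combined order volume, which is what keeps the suggestion list short and leaves the variations to the results page.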
    
    > On Jan 24, 2020, at 7:07 AM, Lucky Sharma <goku0910@gmail.com> wrote:
    > 
    > Hi Audrey,
    > As suggested by Erik, you can index the data into a separate collection.
    > Instead of adding weights in the document, you can also use
    > LTR (Learning to Rank) within Solr to rerank the documents.
    > To further improve relevance in the autosuggestions and capture the
    > positional context of the user for multi-token keywords, you can also
    > use bigrams/trigrams to generate edge n-grams.
    > 
    > 
    > 
    > Regards,
    > Lucky Sharma
    > 
    > On Fri, 24 Jan, 2020, 8:28 pm Lucky Sharma, <goku0910@gmail.com> wrote:
    > 
    >> Hi Audrey,
    >> As suggested by Erik, you can index the data into a separate collection.
    >> Instead of adding weights in the document, you can also use LTR
    >> within Solr to rerank on the features.
    >> 
    >> Regards,
    >> Lucky Sharma
    >> 
    >> On Fri, 24 Jan, 2020, 8:01 pm Audrey Lorberfeld -
    >> Audrey.Lorberfeld@ibm.com, <Audrey.Lorberfeld@ibm.com> wrote:
    >> 
    >>> Erik,
    >>> 
    >>> Thank you! Yes, that's exactly how we were thinking of architecting it.
    >>> And our ML engineer suggested something else for the suggestion weights,
    >>> actually -- to build a model that would programmatically update the weights
    >>> based on those suggestions' live clicks @ position k, etc. Pretty cool
    >>> idea...
    >>> 
    >>> 
    >>> 
    >>> On 1/23/20, 2:26 PM, "Erik Hatcher" <erik.hatcher@gmail.com> wrote:
    >>> 
    >>>    It's a great idea.   And then index that file into a separate lean
    >>> collection of just the suggestions, along with the weight as another field
    >>> on those documents, to use for ranking them at query time with standard
    >>> /select queries.  (this separate suggest collection would also have
    >>> appropriate tokenization to match the partial words as the user types, like
    >>> ngramming)
    >>> 
    >>>        Erik
    >>> 
    >>> 
    >>>> On Jan 20, 2020, at 11:54 AM, Audrey Lorberfeld -
    >>>> Audrey.Lorberfeld@ibm.com <Audrey.Lorberfeld@ibm.com> wrote:
    >>>> 
    >>>> David,
    >>>> 
    >>>> Thank you, that is useful. So, would you recommend using a (clean)
    >>>> field over an external dictionary file? We have lots of "top queries" and
    >>>> measure their nDCG. A thought was to programmatically generate an external
    >>>> file where the weight per query term (or phrase) == its nDCG. Bad idea?
    >>>> 
    >>>> Best,
    >>>> Audrey
    >>>> 
    >>>> On 1/20/20, 11:51 AM, "David Hastings" <hastings.recursive@gmail.com> wrote:
    >>>> 
    >>>>   I've used this quite a bit. My biggest piece of advice is to choose a field
    >>>>   that you know is clean, with well-defined terms/words. You don't want an
    >>>>   autocomplete that has a massive dictionary; it will also make the
    >>>>   start/reload times pretty slow.
    >>>> 
    >>>>   On Mon, Jan 20, 2020 at 11:47 AM Audrey Lorberfeld -
    >>>>   Audrey.Lorberfeld@ibm.com <Audrey.Lorberfeld@ibm.com> wrote:
    >>>> 
    >>>>> Hi All,
    >>>>> 
    >>>>> We plan to incorporate a query autocomplete functionality into our search
    >>>>> engine (like this:
    >>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lucene.apache.org_solr_guide_8-5F1_suggester.html&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=_8ViuZIeSRdQjONA8yHWPZIBlhj291HU3JpNIx5a55M&m=L8V-izaMW_v4j-1zvfiXSqm6aAoaRtk-VJXA6okBs_U&s=vnE9KGyF3jky9fSi22XUJEEbKLM1CA7mWAKrl2qhKC0&e=
    >>>>> ). And I was wondering if anyone has personal experience with this
    >>>>> component and would like to share? Basically, we are just looking for some
    >>>>> best practices from more experienced Solr admins so that we have a starting
    >>>>> place to launch this in our beta.
    >>>>> 
    >>>>> Thank you!
    >>>>> 
    >>>>> Best,
    >>>>> Audrey
    >>>>> 
    >>>> 
    >>>> 
    >>> 
    >>> 
    >>> 
    >>> 
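
[Editor's note] Erik's "ngramming" and Lucky's bigram/trigram suggestion earlier in the thread both come down to indexing prefixes of the suggestion text so that partial input matches as the user types. In Solr this would normally be done in the field's analysis chain (for example with the shingle and edge n-gram filters) rather than in application code; the standalone sketch below only illustrates which terms such a chain would produce. The function names and default lengths are assumptions for the example.

```python
def edge_ngrams(text, min_len=1, max_len=20):
    """Return the prefixes of `text` from min_len to max_len characters."""
    text = text.lower()
    return [text[:n] for n in range(min_len, min(len(text), max_len) + 1)]


def shingles(tokens, size):
    """Return word n-grams ("shingles") of `size` consecutive tokens."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def index_terms(suggestion, max_shingle=2):
    """Edge n-grams of every token and of every shingle up to max_shingle words.

    Prefixes of single tokens let "acc" match "Managerial Accounting";
    prefixes of shingles keep word order, so "managerial a" also matches.
    """
    tokens = suggestion.lower().split()
    terms = set()
    for size in range(1, max_shingle + 1):
        for shingle in shingles(tokens, size):
            terms.update(edge_ngrams(shingle))
    return terms


if __name__ == "__main__":
    print(sorted(index_terms("Managerial Accounting")))
```

Only prefixes are generated, so a mid-word fragment such as "ccount" does not match; that is the usual trade-off of edge n-grams against full n-grams, which match more but grow the index considerably.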
    
    

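
[Editor's note] Audrey's idea of generating an external dictionary file whose weights come from nDCG could look something like the sketch below. It assumes the file-based dictionary layout described in the Solr suggester guide (one entry per line, term and weight separated by a tab by default). It also assumes that weights are read as integers, so a raw nDCG between 0 and 1 would lose its fractional part; the sketch therefore scales nDCG by a hypothetical factor of 1000. Both assumptions should be checked against the Solr version in use. The function name and input shape are invented for the example.

```python
def dictionary_lines(ndcg_by_query, scale=1000):
    """Build "term<TAB>weight" lines from a {query: nDCG} mapping.

    nDCG is expected in [0, 1]; it is scaled to an integer weight
    (scale is an assumed, tunable factor) and entries are ordered by
    descending weight, then alphabetically, for readable output.
    """
    lines = []
    ordered = sorted(ndcg_by_query.items(), key=lambda kv: (-kv[1], kv[0]))
    for query, ndcg in ordered:
        if not 0.0 <= ndcg <= 1.0:
            raise ValueError(f"nDCG out of range for {query!r}: {ndcg}")
        weight = round(ndcg * scale)
        lines.append(f"{query}\t{weight}")
    return lines


if __name__ == "__main__":
    sample = {"solr suggester": 0.82, "edge ngram": 0.5}
    print("\n".join(dictionary_lines(sample)))
```

Writing the result out is then just a matter of joining the lines with newlines into the file the suggester is configured to read. Note that this weights suggestions by how well their results rank, which avoids the click-spam problem Walter raises but says nothing about how often a query is actually issued.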