lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: Suggester duplicating values
Date Thu, 02 Jul 2015 15:42:55 GMT
That is what I was saying :)
Hope it helps
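
For reference, a minimal Ruby sketch of the approach discussed in this thread: collect the suggested terms in a Set on the backend and pass only the distinct ones to the view. The helper name unique_terms and the sample data are hypothetical, and the response layout (suggest, then the dictionary name, then the query string, then a "suggestions" list whose entries carry a "term") is an assumption about the JSON returned by the SuggestComponent, so check it against your own Solr output.

```ruby
require "set"

# Hypothetical helper: returns the distinct suggested terms for one query,
# given a parsed Solr suggest response (layout assumed, see note above).
def unique_terms(response, dictionary, query)
  entries = response.dig("suggest", dictionary, query, "suggestions") || []
  seen = Set.new
  # Set#add? returns nil when the term is already in the set, so select
  # keeps only the first occurrence and preserves the order Solr returned.
  entries.map { |entry| entry["term"] }.select { |term| seen.add?(term) }
end

# Illustrative data only: the same author coming back once per document.
SAMPLE_RESPONSE = {
  "suggest" => {
    "mySuggester" => {
      "J." => {
        "numFound" => 4,
        "suggestions" => [
          { "term" => "J. R. R. Tolkien", "weight" => 0, "payload" => "" },
          { "term" => "J. R. R. Tolkien", "weight" => 0, "payload" => "" },
          { "term" => "J. R. R. Tolkien", "weight" => 0, "payload" => "" },
          { "term" => "J. R. R. Tolkien", "weight" => 0, "payload" => "" }
        ]
      }
    }
  }
}.freeze

puts unique_terms(SAMPLE_RESPONSE, "mySuggester", "J.")
```

Because duplicates are dropped after Solr has applied suggest.count, the view may receive fewer than 20 entries; asking Solr for a larger count and trimming on the Ruby side is one possible workaround.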

2015-07-02 16:32 GMT+01:00 Rafael <rafael.manoel@gmail.com>:

> Just double checking:
>
> In my Ruby backend I ask for (using the given example) all suggested terms
> that start with "J.", then I (probably) add all the terms to a Set, and
> then return the Set to the view. Right?
>
> []'s
> Rafael
>
> On Thu, Jul 2, 2015 at 12:12 PM, Alessandro Benedetti <
> benedetti.alex85@gmail.com> wrote:
>
> > No, I was referring to the fact that a Suggester, as a unit of information,
> > manages simple terms, which are identified simply by themselves.
> >
> > What you need to do is to use some Ruby data structure that prevents
> > duplicates from being inserted, and then offer the suggestions from there.
> >
> > Cheers
> >
> > 2015-07-02 15:42 GMT+01:00 Rafael <rafael.manoel@gmail.com>:
> >
> > > Thanks, Alessandro!
> > >
> > > Well, I'm using Ruby and r-solr as the client library. I didn't get what
> > > you said about the term id. Do I have to create this field? Or is it a
> > > "hidden field" used by Solr under the hood?
> > >
> > > []'s
> > > Rafael
> > >
> > > On Thu, Jul 2, 2015 at 6:41 AM, Alessandro Benedetti <
> > > benedetti.alex85@gmail.com> wrote:
> > >
> > > > Hi Rafael,
> > > > Your problem is clear and it has actually been explored a few times in
> > > > the past.
> > > > I agree with you in the first instance.
> > > >
> > > > A Suggester's basic unit of information is a term, not a document.
> > > > This means that it does not make much sense to return duplicate terms
> > > > (just because they come from different docs).
> > > > The term id should be the term itself, as there is no way for a human to
> > > > perceive any difference between two copies of the same term returned by
> > > > the Suggester.
> > > >
> > > > So, this consideration apart, are you using an intermediate API to query
> > > > Solr? (You definitely should.)
> > > > If you are using a client, your client language should provide a data
> > > > structure implementation that you can use to avoid duplicates.
> > > > Java, for example, gives you HashSet, TreeSet and all the related
> > > > classes.
> > > >
> > > > Hope this helps,
> > > >
> > > > Cheers
> > > >
> > > > 2015-07-01 18:40 GMT+01:00 Rafael <rafael.manoel@gmail.com>:
> > > >
> > > > > Hi, I'm building an autocomplete solution on top of Solr for an ebook
> > > > > seller, but my database is completely denormalized. For example, I have
> > > > > this kind of record:
> > > > >
> > > > > *author           | title                       | price*
> > > > > -----------------+-----------------------------+---------
> > > > > J. R. R. Tolkien | Lord of the Rings           | $10.0
> > > > > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $12.0
> > > > > J. R. R. Tolkien | Lord of the Rings           | $11.0
> > > > > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $7.5
> > > > > J. R. R. Tolkien | Lord of the Rings Hardcover | $30.5
> > > > >
> > > > > ** We are already spending effort to normalize the database, but it will
> > > > > take a while.
> > > > >
> > > > >
> > > > > Thus, when I try to implement a suggester on the author field, for
> > > > > example, if I type "*J.*" I get "*J. R. R. Tolkien*" 4 times.
> > > > >
> > > > > My Suggester Configuration is pretty standard:
> > > > >
> > > > > <!-- schema -->
> > > > >     <fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
> > > > >       <analyzer type="index">
> > > > >         <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > >         <filter class="solr.LowerCaseFilterFactory"/>
> > > > >       </analyzer>
> > > > >       <analyzer type="query">
> > > > >         <tokenizer class="solr.KeywordTokenizerFactory"/>
> > > > >         <filter class="solr.LowerCaseFilterFactory"/>
> > > > >       </analyzer>
> > > > >     </fieldType>
> > > > >
> > > > >
> > > > > <!-- Solrconfig -->
> > > > >   <searchComponent name="suggest" class="solr.SuggestComponent">
> > > > >     <lst name="suggester">
> > > > >       <str name="name">mySuggester</str>
> > > > >       <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
> > > > >       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > > > >       <str name="field">author</str>
> > > > >       <str name="suggestAnalyzerFieldType">textSuggest</str>
> > > > >     </lst>
> > > > >   </searchComponent>
> > > > >
> > > > >   <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
> > > > >     <lst name="defaults">
> > > > >       <str name="suggest">true</str>
> > > > >       <str name="suggest.count">20</str>
> > > > >       <str name="suggest.dictionary">mySuggester</str>
> > > > >     </lst>
> > > > >     <arr name="components">
> > > > >       <str>suggest</str>
> > > > >     </arr>
> > > > >   </requestHandler>
> > > > >
> > > > >
> > > > > And I'm using Solr 5.2.1.
> > > > >
> > > > > *Question:* Is there a way to get only unique values for suggestions? Or
> > > > > would it be simpler to export a file (or even a new table in the
> > > > > database) without duplicated values?
> > > > >
> > > > > Thanks.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > --------------------------
> > > >
> > > > Benedetti Alessandro
> > > > Visiting card : http://about.me/alessandro_benedetti
> > > >
> > > > "Tyger, tyger burning bright
> > > > In the forests of the night,
> > > > What immortal hand or eye
> > > > Could frame thy fearful symmetry?"
> > > >
> > > > William Blake - Songs of Experience -1794 England
> > > >
> > >
> >
> >
> >
> >
>



