lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: Suggester duplicating values
Date Thu, 02 Jul 2015 15:12:56 GMT
No, I was referring to the fact that a Suggester, as a unit of information,
manages simple terms, which are identified simply by themselves.

What you need to do is use some Ruby data structure that prevents duplicates
from being inserted, and then offer the suggestions from there.
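
As a minimal sketch of that idea in Ruby: the method below collects the
suggested terms into a Set so that each term is kept only once, in the order
Solr returned it. It assumes the response has already been parsed into a
Hash shaped like suggest -> dictionary name -> query string -> suggestions.
The method name `unique_suggestions` and the sample data (including the
second author) are hypothetical and only serve the illustration.

```ruby
require 'set'

# Collect unique suggestion terms, preserving the order Solr returned them in.
# `response` is assumed to be the parsed JSON body of a /suggest request.
def unique_suggestions(response, dictionary, query)
  entries = response.dig('suggest', dictionary, query, 'suggestions') || []
  seen = Set.new
  entries.each_with_object([]) do |entry, terms|
    term = entry['term']
    # Set#add? returns nil when the term is already present.
    terms << term if seen.add?(term)
  end
end

# Hypothetical response, shaped like the duplicated output described below.
sample = {
  'suggest' => {
    'mySuggester' => {
      'J.' => {
        'numFound' => 3,
        'suggestions' => [
          { 'term' => 'J. R. R. Tolkien', 'weight' => 0, 'payload' => '' },
          { 'term' => 'J. R. R. Tolkien', 'weight' => 0, 'payload' => '' },
          { 'term' => 'J. K. Rowling', 'weight' => 0, 'payload' => '' }
        ]
      }
    }
  }
}

p unique_suggestions(sample, 'mySuggester', 'J.')
```

Since the duplicates still count against suggest.count on the Solr side, you
may want to request more suggestions than you plan to display.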

Cheers

2015-07-02 15:42 GMT+01:00 Rafael <rafael.manoel@gmail.com>:

> Thanks, Alessandro!
>
> Well, I'm using Ruby and r-solr as the client library. I didn't get what
> you said about the term id. Do I have to create this field? Or is it a
> "hidden field" used by Solr under the hood?
>
> []'s
> Rafael
>
> On Thu, Jul 2, 2015 at 6:41 AM, Alessandro Benedetti <
> benedetti.alex85@gmail.com> wrote:
>
> > Hi Rafael,
> > Your problem is clear, and it has actually been explored a few times in
> > the past.
> > At first glance, I agree with you.
> >
> > A Suggester's basic unit of information is a term, not a document.
> > This means that it does not make much sense to return duplicate terms
> > (just because they come from different docs).
> > The term id should be the term itself, as there is no way for a human to
> > perceive any difference between two copies of the same term returned by
> > the Suggester.
> >
> > That consideration aside, are you using an intermediate API to query
> > Solr (you definitely should)?
> > If you are using a client, your client language should provide a data
> > structure implementation that you can use to avoid duplicates.
> > Java, for example, gives you HashSet, TreeSet and all the related
> > classes.
> >
> > Hope this helps,
> >
> > Cheers
> >
> > 2015-07-01 18:40 GMT+01:00 Rafael <rafael.manoel@gmail.com>:
> >
> > > Hi, I'm building an autocomplete solution on top of Solr for an ebook
> > > seller, but my database is completely denormalized. For example, I have
> > > this kind of record:
> > >
> > > *author           | title                       | price*
> > > -----------------+-----------------------------+---------
> > > J. R. R. Tolkien | Lord of the Rings           | $10.0
> > > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $12.0
> > > J. R. R. Tolkien | Lord of the Rings           | $11.0
> > > J. R. R. Tolkien | Lord of the Rings Vol. 3    | $7.5
> > > J. R. R. Tolkien | Lord of the Rings Hardcover | $30.5
> > >
> > > *We are already spending effort to normalize the database, but it will
> > > take a while.*
> > >
> > >
> > > Thus, when I try to implement a suggester on the author field, for
> > > example, if I type "*J.*" I get "*J. R. R. Tolkien*" 4 times.
> > >
> > > My Suggester Configuration is pretty standard:
> > >
> > > <!-- schema -->
> > >     <fieldType name="textSuggest" class="solr.TextField" positionIncrementGap="100">
> > >       <analyzer type="index">
> > >         <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >         <filter class="solr.LowerCaseFilterFactory"/>
> > >       </analyzer>
> > >       <analyzer type="query">
> > >         <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >         <filter class="solr.LowerCaseFilterFactory"/>
> > >       </analyzer>
> > >     </fieldType>
> > >
> > >
> > > <!-- Solrconfig -->
> > >   <searchComponent name="suggest" class="solr.SuggestComponent">
> > >     <lst name="suggester">
> > >       <str name="name">mySuggester</str>
> > >       <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
> > >       <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > >       <str name="field">author</str>
> > >       <str name="suggestAnalyzerFieldType">textSuggest</str>
> > >     </lst>
> > >   </searchComponent>
> > >
> > >   <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
> > >     <lst name="defaults">
> > >       <str name="suggest">true</str>
> > >       <str name="suggest.count">20</str>
> > >       <str name="suggest.dictionary">mySuggester</str>
> > >     </lst>
> > >     <arr name="components">
> > >       <str>suggest</str>
> > >     </arr>
> > >   </requestHandler>
> > >
> > >
> > > And I'm using Solr 5.2.1.
> > >
> > > *Question:* Is there a way to get only unique values for suggestions? Or
> > > would it be simpler to export a file (or even a new table in the
> > > database) without duplicated values?
> > >
> > > Thanks.
> > >
> >
> >
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>



