lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Facet on a Payload field type?
Date Thu, 31 Aug 2017 19:54:44 GMT

if the "middle tier" of your application doesn't already have an easy
key-value lookup that you keep this translation data in (which would
surprise me, because i've never seen anyone care about this type of
"late-binding" translation of search results w/o also caring about
late-binding translation of other aspects of the UI) then you could always
create a side car collection in solr: one document per "word" using the
english term as the id, with a lowercased copy field for searching + one
field per language with the translations if available.
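
A minimal sketch of what those side car documents could look like, written in Python for illustration. The function name build_sidecar_docs and the shape of the input mapping are hypothetical; the language field names ("fr", "es") follow the fl examples in this message, and the lowercased copy field is assumed to be filled by a copyField rule in the schema rather than by the client.

```python
def build_sidecar_docs(translations):
    """Turn {english_term: {lang_code: translation}} into side car documents.

    Hypothetical helper: one document per English term, with the term as
    the id and one field per language that has a translation.
    """
    docs = []
    for english, by_lang in sorted(translations.items()):
        doc = {"id": english}
        # Skip languages with no translation so the field is simply absent.
        doc.update({lang: text for lang, text in by_lang.items() if text})
        docs.append(doc)
    return docs


docs = build_sidecar_docs({
    "red": {"es": "rojo", "fr": "rouge"},
    "teal": {"fr": None},  # no French translation known yet
})
print(docs)
```

Leaving the language field off when no translation exists makes the "fall back to English" case easy to detect later.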

after doing your main query, toss all the facet.field terms in the
response into a second query to your side car "translation" collection
using the "terms" parser and setting the "rows" == the total number of
terms you're asking to translate and fl=id,fr (or fl=id,es ... whatever
language the user wants)

https://lucene.apache.org/solr/guide/6_6/other-parsers.html

...then use those results to translate the final output.
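
The two steps above can be sketched roughly as follows, again in Python. Both function names are hypothetical, no HTTP call is made (the params dict would be sent to the side car collection's select handler by whatever client you use), and the sketch assumes facet terms contain no commas, since comma is the default separator of the terms parser.

```python
def build_translation_params(facet_terms, lang):
    """Build request params for the side car lookup (hypothetical helper)."""
    terms = sorted(set(facet_terms))
    return {
        # {!terms f=id} matches any document whose id is one of the terms.
        "q": "{!terms f=id}" + ",".join(terms),
        "rows": len(terms),
        "fl": "id," + lang,
    }


def translate_facets(facet_counts, sidecar_docs, lang):
    """Swap English facet terms for translations, keeping English as fallback."""
    lookup = {doc["id"]: doc[lang] for doc in sidecar_docs if doc.get(lang)}
    return [(lookup.get(term, term), count) for term, count in facet_counts]


facets = [("red", 12), ("blue", 7), ("teal", 3)]
params = build_translation_params([term for term, _ in facets], "fr")
# Documents as they might come back from the side car collection.
docs = [{"id": "red", "fr": "rouge"}, {"id": "blue", "fr": "bleu"}, {"id": "teal"}]
print(params["q"])
print(translate_facets(facets, docs, "fr"))
```

The English term stays the value used for filtering; only the label shown to the user changes, so switching languages between searches needs nothing more than a different fl.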


: Date: Thu, 31 Aug 2017 14:12:38 -0500
: From: Webster Homer <webster.homer@sial.com>
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: Facet on a Payload field type?
: 
: You are describing the idea pretty accurately. Apparently Endeca has
: something that sort of supports this, which we used for the problem.
: 
: On Thu, Aug 31, 2017 at 1:59 PM, Chris Hostetter <hossman_lucene@fucit.org>
: wrote:
: 
: >
: > ok, so lemme attempt to restate your objective to ensure no
: > miscommunication:
: >
: > 1) you have fields like "color"
: > 2) you want to index english words into the color field for docs
: > 3) you want to search/filter against these fields using english words as
: > input
: > 4) you want to facet on the fields like "color"
: > 5) you want the list of terms:counts displayed to the end user when
: > faceting on these fields to be in a variety of different languages, based
: > on a "user_lang" option specified at query time and a set of known
: > translations
: > 6) if no english->user_lang translation is available for a particular
: > term, you want to display the english word when displaying the facet
: > counts
: >
: > does that sound right?
: >
: > based on your objective, attempting to embed/encode the various
: > translations into the terms when indexing (as payloads, or an
: > alternative field or prefixed terms, etc...) seems like a vastly
: > overcomplicated way to deal with this problem.
: >
: > If i were in your shoes, i would keep the translation aspect of the
: > display completely distinct from Solr, and after solr has returned the
: > response then loop over the facet.field terms and do a lookup in some
: > other (cached) key/value translation mapping in your middle layer --
: > replacing the english word with the translation if it exists.
: >
: > This has the added benefit of allowing you to tweak the translations w/o
: > reindexing any docs.
: >
: > Practically speaking: the idea of encoding these translations as payloads
: > wouldn't make sense -- because payloads exist per *occurrence* of the term
: > -- ie: it wouldn't make sense to put "es=rojo;fr=rouge" in the payload of
: > a term "red" when indexing a document, because you want those translations
: > for all instances of red -- not just that instance of red in that
: > singular document.
: >
: >
: >
: > : Date: Mon, 28 Aug 2017 13:29:00 -0500
: > : From: Webster Homer <webster.homer@sial.com>
: > : Reply-To: solr-user@lucene.apache.org
: > : To: solr-user@lucene.apache.org
: > : Subject: Re: Facet on a Payload field type?
: > :
: > : The issue is, that we lack translations for much of our attribute data.
: > : We do have English versions. The idea is to use the English values for
: > : the faceted values and for the filters, but be able to retrieve
: > : different language versions of the term to the caller.
: > : If we have a facet on color if the value is red, be able to retrieve rojo
: > : for Spanish etc...
: > :
: > : Also users can switch regions between searches. If a user starts out in
: > : French, executes a search, selects a facet then switches to German they
: > : should get the German for the facet (if it exists) even when they
: > : originally used French. If all of the searching was in English where we
: > : have the data, we could then show French (or German etc) for the facet
: > : value.
: > :
: > : The real field value that we use for filtering would be in English but
: > : the values returned to the user would be in the language of their
: > : locale or English if we don't have a translation for it. The idea being
: > : that the translations would be stored in the payloads
: > :
: > : On Wed, Aug 23, 2017 at 7:47 PM, Chris Hostetter
: > : <hossman_lucene@fucit.org> wrote:
: > :
: > : >
: > : > : The payload idea was from my boss, it's similar to how they did this
: > : > : in Endeca.
: > : >         ...
: > : > : My alternate idea is to have sets of facet fields for different
: > : > : languages, then let our service layer determine the correct one for
: > : > : the user's language, but I'm curious as to how others have solved this.
: > : >
: > : > Let's back up for a minute -- can you please explain your ultimate
: > : > goal, from a "solr client application" perspective? (assuming we have
: > : > no knowledge of how/how you might have used Endeca in the past)
: > : >
: > : > What is it you want your application to be able to do when indexing
: > : > docs to solr and making queries to solr?  give us some real world
: > : > examples
: > : >
: > : >
: > : >
: > : > (If i had to guess: i gather maybe you're just dealing with a
: > : > "keywords" type field that you want to facet on -- and maybe you
: > : > could use a different field for each language, or encode the
: > : > languages as a prefix on each term and use facet.prefix to restrict
: > : > the facet constraints returned)
: > : >
: > : >
: > : >
: > : > https://people.apache.org/~hossman/#xyproblem
: > : > XY Problem
: > : >
: > : > Your question appears to be an "XY Problem" ... that is: you are
: > : > dealing with "X", you are assuming "Y" will help you, and you are
: > : > asking about "Y" without giving more details about the "X" so that we
: > : > can understand the full issue.  Perhaps the best solution doesn't
: > : > involve "Y" at all?
: > : > See Also: http://www.perlmonks.org/index.pl?node_id=542341
: > : >
: > : >
: > : >
: > : > :
: > : > : On Wed, Aug 23, 2017 at 2:10 PM, Markus Jelsma
: > : > : <markus.jelsma@openindex.io> wrote:
: > : > :
: > : > : > Technically they could, faceting is possible on TextField, but it
: > : > : > would be useless for faceting. Payloads are only used for scoring
: > : > : > via a custom Similarity. Payloads also can only contain one byte of
: > : > : > information (or was it 64 bits?)
: > : > : >
: > : > : > Payloads are not something you want to use when dealing with
: > : > : > translations. We handle facet constraint (and facet field)
: > : > : > translations just by mapping internal value to a translated value
: > : > : > when displaying facet, and vice versa when filtering.
: > : > : >
: > : > : > -----Original message-----
: > : > : > > From:Webster Homer <webster.homer@sial.com>
: > : > : > > Sent: Wednesday 23rd August 2017 20:22
: > : > : > > To: solr-user@lucene.apache.org
: > : > : > > Subject: Facet on a Payload field type?
: > : > : > >
: > : > : > > Is it possible to facet on a payload field type?
: > : > : > >
: > : > : > > We are moving from Endeca to Solr. We have a number of Endeca
: > : > : > > facets where we have hacked in multilanguage support. The multiple
: > : > : > > languages are really just for displaying the value of a term;
: > : > : > > internally the value used to search is in English. The problem is
: > : > : > > that we don't have translations for most of our facet data and this
: > : > : > > was a way to support multiple languages with the data we have.
: > : > : > >
: > : > : > > Looking at the Solrj FacetField class I cannot tell if the value
: > : > : > > can contain a payload or not
: > : > : > >
: > : > : > > --
: > : > : > >
: > : > : > >
: > : > : > > This message and any attachment are confidential and may be
: > : > privileged or
: > : > : > > otherwise protected from disclosure. If you are not the intended
: > : > : > recipient,
: > : > : > > you must not copy this message or attachment or disclose the
: > : > contents to
: > : > : > > any other person. If you have received this transmission in
: > error,
: > : > please
: > : > : > > notify the sender immediately and delete the message and any
: > : > attachment
: > : > : > > from your system. Merck KGaA, Darmstadt, Germany and any of its
: > : > : > > subsidiaries do not accept liability for any omissions or errors
: > in
: > : > this
: > : > : > > message which may arise as a result of E-Mail-transmission or for
: > : > damages
: > : > : > > resulting from any unauthorized changes of the content of this
: > : > message
: > : > : > and
: > : > : > > any attachment thereto. Merck KGaA, Darmstadt, Germany and any
: > of its
: > : > : > > subsidiaries do not guarantee that this message is free of
: > viruses
: > : > and
: > : > : > does
: > : > : > > not accept liability for any damages caused by any virus
: > transmitted
: > : > : > > therewith.
: > : > : > >
: > : > : > > Click http://www.emdgroup.com/disclaimer to access the German,
: > : > French,
: > : > : > > Spanish and Portuguese versions of this disclaimer.
: > : > : > >
: > : > : >
: > : > :
: > : >
: > : > -Hoss
: > : > http://www.lucidworks.com/
: > : >
: > :
: >
: > -Hoss
: > http://www.lucidworks.com/
: >
: 

-Hoss
http://www.lucidworks.com/
