lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Retrieve term payloads / custom PayloadFilter
Date Thu, 08 Jul 2010 14:28:46 GMT
If you know this at index time, could you index language-specific fields,
e.g. text_en, text_de, title_en, title_de, etc.? Perhaps you could also have
a catch-all field that contains everything.

Then you would search on a per-field, per-language basis.
PerFieldAnalyzerWrapper would automatically apply the proper
language-specific Analyzer to each field.
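
To make the naming scheme concrete, here is a minimal sketch in plain Java. It is
not Lucene API: LangFields, fieldFor, register and analyzerFor are hypothetical
names, and analyzers are represented by plain strings as an assumption. It only
illustrates the lookup idea behind per-field analysis, where a field such as
text_de maps to its own analyzer and anything unregistered falls back to a default.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in for per-field analyzer dispatch; NOT Lucene API.
// Analyzer names are plain strings here (an assumption for illustration).
final class LangFields {
    private final Map<String, String> analyzerByField = new HashMap<>();
    private final String defaultAnalyzer;

    LangFields(String defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Builds the field name for a base field and language tag: text + en -> text_en.
    static String fieldFor(String base, String lang) {
        return base + "_" + lang.toLowerCase(Locale.ROOT);
    }

    // Associates a language-specific field with an analyzer name.
    void register(String base, String lang, String analyzerName) {
        analyzerByField.put(fieldFor(base, lang), analyzerName);
    }

    // Returns the analyzer registered for the field, or the default
    // (which is what a catch-all field would get).
    String analyzerFor(String field) {
        return analyzerByField.getOrDefault(field, defaultAnalyzer);
    }

    public static void main(String[] args) {
        LangFields fields = new LangFields("standard");
        fields.register("text", "en", "english");
        fields.register("text", "de", "german");
        System.out.println(fields.analyzerFor("text_de"));
        System.out.println(fields.analyzerFor("text_all"));
    }
}
```

In real code the map values would be Analyzer instances handed to
PerFieldAnalyzerWrapper together with a default Analyzer; the strings above
only stand in for them.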

This may turn out to be too clumsy if you have many fields times many
languages.

Actually, this looks a lot like what Solr could provide, perhaps with
dynamic fields and the dismax query parser.

Best
Erick

On Thu, Jul 8, 2010 at 4:47 AM, Bernhard Haslhofer <bernhard.haslhofer@univie.ac.at> wrote:

> Hi,
>
> in my application I have documents that may contain terms and term
> translations in multiple languages. The language tag of each term is
> explicitly given and should be available in the index in order to enable
> queries for documents that contain a certain term (optionally in a given
> language).
>
> I could split each document into a set of sub-documents, each containing terms
> in one specific language plus a dedicated field indicating the language. But
> then I need multiple queries to retrieve stored term translations from the
> sub-documents.
>
> The better alternative, IMO, is not to split the document and instead to
> assign the language tags as payloads to the terms. But then I need
>
> (i) a search filter that eliminates docs based on a given language tag and
>
> (ii) a way to access the term payloads from the documents returned by the
> searcher
>
> For both I haven't found a solution yet. Can I write a custom PayloadFilter
> or is there already some implementation available? Is it possible to access
> the term payloads from the search results?
>
> Thanks.
> Bernhard
