lucene-java-user mailing list archives

From Michael Breu <Michael.B...@arctis.at>
Subject Re: real infix suggester, not AnalyzingInfixSuggester
Date Mon, 27 Oct 2014 13:15:25 GMT

Hello Michael,

Thank you for your kind support.

I had a look at elasticsearch-analysis-decompound and tried to
integrate it. However, it seemed to me that it is somewhat hard to
integrate into our work, which is based on lucene-core.

I managed to set up a test environment, but I was not successful in
decomposing even the simplest words. It seems to need some base
vocabulary setup, and it is not clear to me whether the provided
settings are even applicable to German. So finally I gave up.
Maybe a real infix lookup is simpler to manage.
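
To make "real infix lookup" concrete, here is a minimal sketch in plain
Java, without Lucene. It stores every suffix of every term in a sorted
map, so an infix query becomes a prefix query on the suffixes. The class
SuffixInfixLookup and its methods add and lookup are hypothetical names
for illustration only; they are not part of lucene-core or any existing
suggester.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper (not a Lucene class): infix lookup via a sorted index of suffixes. */
class SuffixInfixLookup {
    // Maps each lowercased suffix of each term to the original terms it came from.
    private final TreeMap<String, Set<String>> suffixes = new TreeMap<>();

    /** Indexes all suffixes of the given term. */
    void add(String term) {
        String lower = term.toLowerCase(Locale.ROOT);
        for (int i = 0; i < lower.length(); i++) {
            suffixes.computeIfAbsent(lower.substring(i), k -> new TreeSet<>()).add(term);
        }
    }

    /** Returns up to max terms that contain the infix, sorted by natural String order. */
    List<String> lookup(String infix, int max) {
        String q = infix.toLowerCase(Locale.ROOT);
        Set<String> hits = new TreeSet<>();
        if (!q.isEmpty()) {
            // Suffixes starting with q fall into the key range [q, q + '\uffff').
            // Assumption: no indexed term contains the character '\uffff'.
            SortedMap<String, Set<String>> range = suffixes.subMap(q, q + Character.MAX_VALUE);
            for (Set<String> terms : range.values()) {
                hits.addAll(terms);
            }
        }
        List<String> result = new ArrayList<>(hits);
        return result.size() > max ? result.subList(0, max) : result;
    }

    public static void main(String[] args) {
        SuffixInfixLookup lookup = new SuffixInfixLookup();
        lookup.add("Donaudampfschifffahrtsgesellschaft");
        lookup.add("Schifffahrt");
        lookup.add("Dampfmaschine");
        // prints [Donaudampfschifffahrtsgesellschaft, Schifffahrt]
        System.out.println(lookup.lookup("schiff", 10));
    }
}
```

The trade-off is memory: a term of length n contributes n suffixes, so
this is only reasonable for a modest vocabulary. It also does no ranking
or weighting, which a real suggester would need.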

Michael

> Michael Sokolov <mailto:msokolov@safaribooksonline.com>
> Monday, 27 October 2014 12:22
> Have you considered combining the AnalyzingInfixSuggester with a
> German decompounding filter?  If you break compound words into their
> constituent parts during analysis, then the suggester will be able to
> do what you want (prefix matches on the word-parts).  I found this
> project with a quick Google search:
> https://github.com/jprante/elasticsearch-analysis-decompound; I don't
> know how good it is or whether it fits with your environment, but it
> could be a start.
>
> -Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
> Michael Breu <mailto:Michael.Breu@arctis.at>
> Monday, 27 October 2014 11:34
> Hello,
>
> I'm looking for an infix suggester that allows infix search for a given
> term. This might not be that important in English.
> However, in German we have quite complex compound words like
> Donaudampfschifffahrtsgesellschaftskapitän,
> which is composed of the nouns Donau (Danube), Dampf (steam), Schiff
> (boat), etc.
>
> So I would like to support searches like *schiff* to suggest
> Donaudampfschifffahrtsgesellschaft.
>
> I first tried the AnalyzingInfixSuggester; however, it does not do
> what I expect, because it does prefix matches on tokens, not infix
> matches.
>
> I tried to adapt the AnalyzingSuggester; however, it seemed too complex
> for an easy conversion to an infix suggester.
>
> I know that this was already asked in
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201103.mbox/%3C1301054307585-2729996.post@n3.nabble.com%3E,
> however, nobody answered this post as far as I know.
>
> Thank you for your help
>
> Wallenstein
