lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Boost term according to phonetic representation
Date Tue, 31 Jan 2012 09:50:23 GMT
If all you have indexed are three identical terms there will be no way
to make Markus come top.

You could index both the normalized version and the original (the
latter maybe with StandardAnalyzer to get lowercasing etc.) in separate
fields, then search across both fields, boosting whichever makes sense
for you.

normalized:Markus original:Markus^2

You could use PerFieldAnalyzerWrapper to specify different analyzers
for different fields.
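
To illustrate why the two-field query above puts the exact spelling
first, here is a minimal sketch in plain Java. It does not use Lucene
and does not reproduce Lucene's real scoring; it only mimics the shape
of normalized:X original:X^2 with a toy score. The normalize() method
is a hypothetical stand-in for whatever phonetic analyzer is in use,
and the weights 1 and 2 are assumptions taken from the query above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class TwoFieldBoostSketch {

    // Hypothetical stand-in for the phonetic analyzer: folds the three
    // example spellings to "marcus". Not a real phonetic algorithm.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replace("mh", "m").replace('k', 'c');
    }

    // Stand-in for the "original" field: lowercasing only.
    static String original(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Toy score with the shape of: normalized:query original:query^2
    // (assumed weights; real Lucene scoring is more involved).
    static double score(String docName, String query) {
        double s = 0.0;
        if (normalize(docName).equals(normalize(query))) {
            s += 1.0; // match on the normalized field
        }
        if (original(docName).equals(original(query))) {
            s += 2.0; // boosted match on the original field
        }
        return s;
    }

    // Returns the names ordered by descending toy score; ties keep input order.
    static List<String> rank(List<String> names, String query) {
        List<String> ranked = new ArrayList<>(names);
        ranked.sort(Comparator.comparingDouble((String n) -> score(n, query)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Marcus", "Markus", "Mharcus");
        System.out.println(rank(names, "markus"));
    }
}
```

All three names match on the normalized field, but only Markus also
matches on the original field, so it collects the extra boost.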


Another idea might be to change your normalization analyzer to output
the original as well as the normalized version.  Sort of like
synonyms.  So, if the normalized version of all three was "marcus"
you'd end up with indexed terms like

marcus marcus
marcus markus
marcus mharcus

and at search time "Markus" would expand to "marcus markus", so the
second doc, which matches both terms, would come top.
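
A rough sketch of that second idea, again in plain Java rather than as
a real Lucene analyzer: analyze() emits both the normalized and the
lowercased original term for a name, and matchingTerms() simply counts
the terms shared with the query. The normalize() method and the
term-overlap count are assumptions for illustration only; Lucene's
actual ranking would come from its own scoring, not from this count.
Unlike the listing above, the sketch collapses duplicate terms, so
"Marcus" yields a single term.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class DualTokenSketch {

    // Hypothetical stand-in for the phonetic normalization.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replace("mh", "m").replace('k', 'c');
    }

    // Emits the normalized term plus the lowercased original, like a
    // synonym-style expansion. Duplicates collapse because this is a set.
    static Set<String> analyze(String name) {
        Set<String> terms = new LinkedHashSet<>();
        terms.add(normalize(name));
        terms.add(name.toLowerCase(Locale.ROOT));
        return terms;
    }

    // Counts how many query terms also occur in the document's terms.
    static int matchingTerms(String docName, String query) {
        Set<String> shared = new LinkedHashSet<>(analyze(docName));
        shared.retainAll(analyze(query));
        return shared.size();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Marcus", "Markus", "Mharcus"}) {
            System.out.println(name + " -> " + analyze(name)
                    + ", terms shared with query: " + matchingTerms(name, "Markus"));
        }
    }
}
```

With the same analysis applied at index and search time, the query
shares two terms with Markus and only one with each of the others.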


--
Ian.


On Mon, Jan 30, 2012 at 10:35 PM, Felipe Carvalho
<felipe.carvalho@gmail.com> wrote:
> Consider a people index, containing People documents with the following
> names:
>
> Doc 1 [name: "Marcus"]
> Doc 2 [name: "Markus"]
> Doc 3 [name: "Mharcus"]
>
> Suppose I use an analyzer so that all 3 names have the same representation.
> Supposing I use the same analyzer when running a search for name=markus, is
> there a way to make Markus appear on top of the others?
>
> I was looking at this article (
> http://lucene.apache.org/java/3_5_0/queryparsersyntax.html#Boosting%20a%20Term)
> but I'm not sure it applies to what I need.
>
> Thanks
