lucene-java-user mailing list archives

From Joel Halbert <j...@su3analytics.com>
Subject Re: hit highlighting in lucene ?
Date Thu, 21 May 2009 12:48:10 GMT
The highlighter should be language independent, so long as you are
consistent in your use of Analyzer across indexing, querying, and
highlighting.
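
To make the consistency point concrete, here is a minimal, self-contained
sketch. It is a toy and does not use Lucene's Highlighter API: the class
name ConsistentAnalysisDemo and the methods normalize, analyze, and
highlight are hypothetical, and the "analyzer" is assumed to be nothing
more than letter-run tokenization plus lower-casing. The same analysis is
applied to the query and to the text being highlighted, so a match is
found regardless of the language of the text.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ConsistentAnalysisDemo {

    // Hypothetical stand-in for an Analyzer's normalization step:
    // tokens are compared in lower case.
    static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for an Analyzer's tokenization step:
    // a token is a maximal run of letters (any script, via Character.isLetter).
    static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetter(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetter(text.charAt(i))) {
                i++;
            }
            terms.add(normalize(text.substring(start, i)));
        }
        return terms;
    }

    // Wraps every token of `text` whose normalized form occurs in the
    // analyzed query with <b>...</b>. The text is tokenized with the same
    // rules as the query; everything else is copied through unchanged.
    static String highlight(String query, String text) {
        Set<String> queryTerms = analyze(query);
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetter(text.charAt(i))) {
                out.append(text.charAt(i));
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetter(text.charAt(i))) {
                i++;
            }
            String token = text.substring(start, i);
            if (queryTerms.contains(normalize(token))) {
                out.append("<b>").append(token).append("</b>");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Russian example: the query is lower case, the text is capitalized.
        System.out.println(highlight("поиск", "Быстрый Поиск по тексту"));
    }
}
```

If the query were normalized one way and the text another (for example,
lower-casing only one side), the capitalized word above would not be
marked. That is the kind of mismatch that using a single Analyzer
throughout avoids.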

As for the most appropriate Analyzer to use for your local language,
this is a separate question - especially if you are using stop word and
stemming filters.

The StandardAnalyzer is geared toward English, since by default it uses
a StopFilter with a list of English stop words only.


-----Original Message-----
From: KK <dioxide.software@gmail.com>
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: hit highlighting in lucene ?
Date: Thu, 21 May 2009 17:51:13 +0530

Hi All,
I was looking for various ways of implementing hit highlighting in Lucene
and found some standard classes that support highlighting, like this:
lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/highlight/package-summary.html

But what I believe is that this is only for English. Or does it support
other languages? I want to support highlighting for some non-English
languages, which I'm able to index and fetch using UTF-8 encoding. Does
this mean that, to get highlighting, I have to take the UTF-8 query, look
for it in the result, and add the appropriate tags wherever required? That
essentially boils down to reimplementing the standard highlighter. I think
the standard highlighter also supports other languages. Correct me if I'm
wrong.

Due to my requirement constraints I'm using just SimpleAnalyzer, and we
don't have tokenizers for these regional languages. Any other ideas for
doing the same would be helpful as well.

Thanks,
KK.



