lucene-java-user mailing list archives

From Bernhard Messer <bernhard.mes...@intrafind.de>
Subject Re: English and French documents together / analysis, indexing, searching
Date Thu, 20 Jan 2005 18:05:12 GMT
I think the easiest way is to use Lucene's StandardAnalyzer. If you
want to use the Snowball stemmers, you have to add a language guesser to
determine the language of each document before creating the analyzer.
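
To illustrate what such a language guesser might look like, here is a
minimal sketch. It is not part of Lucene: the class name LanguageGuesser,
the guess() method and the two tiny stop word lists are all made up for
this example, and a real guesser would need far better statistics. It
returns "English" or "French", which I believe are the names the
SnowballAnalyzer constructor expects, but please check that against the
version you use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Lucene class: guesses the language of a text
// by counting how many of its tokens are common English or French stop words.
class LanguageGuesser {

    // Tiny illustrative word lists (assumption: a real list would be much longer).
    private static final Set<String> ENGLISH = new HashSet<>(Arrays.asList(
            "the", "and", "of", "to", "is", "in", "that", "it", "with", "for"));
    private static final Set<String> FRENCH = new HashSet<>(Arrays.asList(
            "le", "la", "les", "des", "et", "est", "une", "un", "dans", "que", "pour", "du"));

    // Returns "French" if French stop words outnumber English ones, else "English".
    // English is the fallback because the collection is predominantly English.
    static String guess(String text) {
        int english = 0;
        int french = 0;
        // Split on anything that is not a letter, so accented letters stay in the token.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (ENGLISH.contains(token)) {
                english++;
            }
            if (FRENCH.contains(token)) {
                french++;
            }
        }
        return french > english ? "French" : "English";
    }

    public static void main(String[] args) {
        System.out.println(guess("The index is searched with the same analyzer"));
        System.out.println(guess("Les documents sont dans le même index"));
    }
}
```

At indexing time you would call guess() on the document text and use the
result to pick the stemmer for that document. At search time the query is
usually too short to guess reliably, so that side needs a separate decision.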

regards
Bernhard

amigo@max3d.com wrote:

> Greetings everyone
>
> I wonder whether there is a way to analyze both English and French
> documents using the same analyzer.
> The reason is that we have predominantly English documents, but there
> are some French ones, and it all has to go into the same index
> and be searchable from the same location during any particular search.
> Is there a way to analyze both types of documents with
> the same analyzer (and if so, which one)?
>
> I've looked around and I see there's a Snowball analyzer, but you have
> to specify the language of analysis, and I do not know that
> ahead of time during indexing, nor do I know it most of the time during
> searching (users would like to search in both document types).
>
> There's also the issue of accented letters in French words and searching
> for them (how are they even indexed in the first place)?
> Has anyone dealt with this before, and how did you solve the problem?
>
> thanks
>
> -pedja
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>

