lucene-dev mailing list archives

From Andrzej Bialecki
Subject Re: Ok to add method IndexWriter.addDocument( Analyzer, Document ) ?
Date Thu, 26 Jun 2003 21:53:58 GMT
Randy Darling wrote:
> Would it be ok to add an extra addDocument method to
> IndexWriter that would take an analyzer in addition to
> the document?
> I am going to be indexing documents for multiple languages
> and I would prefer to not have to reopen a writer for
> each document that we are going to index.
> I took a look at the code and it looks pretty straightforward
> and it didn't look like it would break anything.

I had the same problem, but I came up with a workaround which might be
helpful to you. I just wrote a facade analyzer, which selects the
appropriate language-specific analyzer just before I call addDocument.
Something like:

	SwitchLangAnalyzer sla = new SwitchLangAnalyzer(new Analyzer[] {
		new GermanAnalyzer(), new RussianAnalyzer(), new SwedishAnalyzer()});
	IndexWriter iw = new IndexWriter(dir, sla, true);
	// switch sla to German, then add German doc;
	// switch sla to Russian, then add Russian doc;

..and so on...

You need to be extra careful, though, about how you use such an index
afterwards, especially if you use stemming or stop words. I also store
a "lang" field, which I use to limit the search to documents in a given
language, and I use the same sub-analyzer for queries.
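
A minimal sketch of the facade idea, for illustration only. It is not
the actual SwitchLangAnalyzer code: the TextAnalyzer interface is a
hypothetical stand-in for Lucene's Analyzer so that the example runs
without Lucene on the classpath, StopWordAnalyzer is a toy sub-analyzer,
and the register/setLanguage methods and the "de"/"en" keys are
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for Lucene's Analyzer (assumption, not the Lucene API).
interface TextAnalyzer {
    List<String> analyze(String text);
}

// Toy sub-analyzer: lowercases, splits on non-letters, drops stop words.
class StopWordAnalyzer implements TextAnalyzer {
    private final Set<String> stopWords;

    StopWordAnalyzer(String... stopWords) {
        this.stopWords = new HashSet<>(Arrays.asList(stopWords));
    }

    public List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}

// Facade: delegates to whichever sub-analyzer is currently selected.
class SwitchLangAnalyzer implements TextAnalyzer {
    private final Map<String, TextAnalyzer> byLang = new HashMap<>();
    private TextAnalyzer current;

    void register(String lang, TextAnalyzer analyzer) {
        byLang.put(lang, analyzer);
    }

    // Call this just before analyzing (or indexing) a document in that language.
    void setLanguage(String lang) {
        TextAnalyzer a = byLang.get(lang);
        if (a == null) {
            throw new IllegalArgumentException("No analyzer for language: " + lang);
        }
        current = a;
    }

    public List<String> analyze(String text) {
        if (current == null) {
            throw new IllegalStateException("Call setLanguage first");
        }
        return current.analyze(text);
    }
}

class SwitchLangDemo {
    public static void main(String[] args) {
        SwitchLangAnalyzer sla = new SwitchLangAnalyzer();
        sla.register("de", new StopWordAnalyzer("der", "die", "das", "und"));
        sla.register("en", new StopWordAnalyzer("the", "and", "a"));

        sla.setLanguage("de");
        System.out.println(sla.analyze("Der Hund und die Katze")); // [hund, katze]

        sla.setLanguage("en");
        System.out.println(sla.analyze("The dog and the cat"));    // [dog, cat]
    }
}
```

At query time the same switch applies: select the language that matches
the "lang" restriction before analyzing the query text, so that query
terms pass through the same stop-word and stemming rules as the indexed
terms. Because the selected language is mutable state on the facade, a
sketch like this is only safe when a single thread does the switching
and the indexing.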

Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer
