lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: Subject indexing and searching documents with multiple languages
Date Mon, 08 May 2006 12:34:18 GMT
We wrote our own MultiSearcher-type class that handles this problem.  It 
takes a query in the user's native language and then feeds it to the 
searcher for that language, which uses a machine translation component 
to create a query for that index using that language's Analyzer.
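
A rough sketch of that idea, for illustration only. It is not our actual class and does not use the Lucene API: the in-memory index, the analyzer() helper and the dictionary-based translator are hypothetical stand-ins for a per-language Lucene index, its Analyzer and a real machine translation component. The point it shows is that the raw user query is handed to each language's searcher, which translates and analyzes it itself, so no single pre-analyzed query is shared across indexes.

```java
import java.util.*;
import java.util.function.Function;

class PerLanguageSearch {

    /** Hypothetical stand-in for a language-specific Analyzer:
     *  lowercase, split on non-letters, drop the given stop words. */
    static Function<String, List<String>> analyzer(String... stopWords) {
        Set<String> stops = new HashSet<>(Arrays.asList(stopWords));
        return text -> {
            List<String> terms = new ArrayList<>();
            for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
                if (!token.isEmpty() && !stops.contains(token)) terms.add(token);
            }
            return terms;
        };
    }

    /** Hypothetical stand-in for machine translation: word-by-word dictionary lookup. */
    static Function<String, String> dictionaryTranslator(Map<String, String> dictionary) {
        return query -> {
            StringBuilder out = new StringBuilder();
            for (String word : query.toLowerCase(Locale.ROOT).split("\\s+")) {
                out.append(dictionary.getOrDefault(word, word)).append(' ');
            }
            return out.toString().trim();
        };
    }

    /** One index per language, with its own analyzer and query translator. */
    static final class LanguageIndex {
        private final Function<String, List<String>> analyzer;
        private final Function<String, String> translator;
        private final Map<String, Set<String>> postings = new HashMap<>();

        LanguageIndex(Function<String, List<String>> analyzer,
                      Function<String, String> translator) {
            this.analyzer = analyzer;
            this.translator = translator;
        }

        void add(String docId, String text) {
            for (String term : analyzer.apply(text)) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }

        /** Translates the raw query, analyzes it with this index's analyzer, ORs the terms. */
        Set<String> search(String userQuery) {
            Set<String> hits = new TreeSet<>();
            for (String term : analyzer.apply(translator.apply(userQuery))) {
                hits.addAll(postings.getOrDefault(term, Collections.emptySet()));
            }
            return hits;
        }
    }

    /** Fans the same raw query out to the selected language indexes and merges the hits. */
    static Set<String> search(Map<String, LanguageIndex> indexes,
                              String userQuery, String... languages) {
        Set<String> hits = new TreeSet<>();
        for (String language : languages) {
            LanguageIndex index = indexes.get(language);
            if (index != null) hits.addAll(index.search(userQuery));
        }
        return hits;
    }

    /** Tiny demo data: an English and a German index, queried in English. */
    static Map<String, LanguageIndex> demoIndexes() {
        LanguageIndex en = new LanguageIndex(analyzer("the", "a"), Function.identity());
        en.add("en-1", "The red house");
        en.add("en-2", "A blue car");

        Map<String, String> enToDe = new HashMap<>();
        enToDe.put("the", "das");
        enToDe.put("house", "haus");
        LanguageIndex de = new LanguageIndex(analyzer("der", "die", "das"),
                                             dictionaryTranslator(enToDe));
        de.add("de-1", "Das rote Haus");
        de.add("de-2", "Der blaue Wagen");

        Map<String, LanguageIndex> indexes = new HashMap<>();
        indexes.put("en", en);
        indexes.put("de", de);
        return indexes;
    }

    public static void main(String[] args) {
        Map<String, LanguageIndex> indexes = demoIndexes();
        System.out.println(search(indexes, "the house", "en", "de"));
        System.out.println(search(indexes, "the house", "de"));
    }
}
```

Searching a single language is just a matter of passing one language code. With real Lucene indexes the same shape should apply: parse the query once per index with that index's Analyzer, then merge the results.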

-Grant

Peter and Stefan wrote:
> Hello,
> we need to index and search documents of multiple languages. 
> Our current approach is:
> Determine the language of each document before passing it to Lucene and use
> a Lucene index for each language. This seems to be necessary because the
> IndexWriter takes an analyzer as parameter. Thus we can pass the English
> documents to the IndexWriter created with the English analyzer and so on.
> Our problem is the search: we would like to be able to search in only one or
> in all of the language-specific indexes. That is not a problem in itself,
> because we can use the MultiSearcher. But the MultiSearcher takes a single
> query as its parameter, and that query is generated using one analyzer. We
> would need to generate differently analyzed queries for the different indexes.
> Has anybody found a solution to this problem who can point us in a direction
> to investigate further?
> Greetings 
> Peter and Stefan


Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 
Voice:  315-443-5484 
Fax: 315-443-6886 
