lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Re: Subject indexing and searching documents with multiple languages
Date Mon, 08 May 2006 12:34:18 GMT
We wrote our own MultiSearcher-style class to handle this problem.  It 
takes a query in the user's native language and feeds it to the searcher 
for each language.  Each searcher uses a machine translation component 
to create a query for its index using that language's Analyzer.
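
A minimal sketch of that per-language query step, in plain Java with no
Lucene dependency. It is an illustration only, not the class described
above. The names PerLanguageQueryBuilder, Translator and TermAnalyzer are
hypothetical stand-ins: Translator plays the role of the machine translation
component, and TermAnalyzer plays the role of a language-specific Analyzer.
The toy dictionary and stop-word lists are made up for the demo.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Sketch: turn one user query into one analyzed query per language index. */
final class PerLanguageQueryBuilder {

    /** Hypothetical stand-in for a machine translation component. */
    interface Translator {
        String translate(String text, String fromLang, String toLang);
    }

    /** Hypothetical stand-in for a language-specific Analyzer: text to terms. */
    interface TermAnalyzer {
        List<String> analyze(String text);
    }

    private final Translator translator;
    private final Map<String, TermAnalyzer> analyzers = new LinkedHashMap<>();

    PerLanguageQueryBuilder(Translator translator) {
        this.translator = translator;
    }

    /** Registers the analyzer that was used to build the index for this language. */
    void register(String lang, TermAnalyzer analyzer) {
        analyzers.put(lang, analyzer);
    }

    /** For each registered language, returns the terms to search in that index. */
    Map<String, List<String>> build(String userQuery, String userLang) {
        Map<String, List<String>> perIndex = new LinkedHashMap<>();
        for (Map.Entry<String, TermAnalyzer> e : analyzers.entrySet()) {
            String lang = e.getKey();
            // Translate only when the index language differs from the user's language.
            String text = lang.equals(userLang)
                    ? userQuery
                    : translator.translate(userQuery, userLang, lang);
            perIndex.put(lang, e.getValue().analyze(text));
        }
        return perIndex;
    }

    /** Toy analyzer: lowercase, split on whitespace, drop stop words. */
    static TermAnalyzer stopwordAnalyzer(Set<String> stopWords) {
        return text -> {
            List<String> terms = new ArrayList<>();
            for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty() && !stopWords.contains(token)) {
                    terms.add(token);
                }
            }
            return terms;
        };
    }

    /** Toy English-to-German translator: word-by-word dictionary lookup. */
    static Translator dictionaryTranslator(Map<String, String> enToDe) {
        return (text, from, to) -> {
            if (!from.equals("en") || !to.equals("de")) {
                throw new IllegalArgumentException("toy translator only handles en to de");
            }
            StringBuilder out = new StringBuilder();
            for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(enToDe.getOrDefault(word, word));
            }
            return out.toString();
        };
    }

    /** Builds a demo instance with made-up English and German settings. */
    static PerLanguageQueryBuilder demoBuilder() {
        PerLanguageQueryBuilder builder = new PerLanguageQueryBuilder(
                dictionaryTranslator(Map.of("the", "das", "house", "haus")));
        builder.register("en", stopwordAnalyzer(Set.of("the", "a", "an")));
        builder.register("de", stopwordAnalyzer(Set.of("der", "die", "das")));
        return builder;
    }

    public static void main(String[] args) {
        // Prints {en=[house], de=[haus]}
        System.out.println(demoBuilder().build("The House", "en"));
    }
}
```

In a real setup, each entry of the returned map would presumably be replaced
by a Lucene Query parsed with that language's Analyzer and run against that
language's searcher, with the hits merged afterwards. How scores from
different indexes are combined is left open here.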

-Grant

pbatcoi@gmx.net wrote:
> Hello,
>
> we need to index and search documents in multiple languages.
>
> Our current approach is:
>
> Determine the language of each document before passing it to Lucene and use
> a Lucene index for each language. This seems to be necessary because the
> IndexWriter takes an analyzer as parameter. Thus we can pass the English
> documents to the IndexWriter created with the English analyzer and so on.
>
> Our problem is the search: we would like to be able to search in only one or
> all language-specific indexes. That is not a problem in itself, because we can
> use the MultiSearcher. But the MultiSearcher takes one query as parameter and the
> query is generated using an analyzer. We would need to generate different
> analyzed queries for the different indexes.
>
> Has anybody found a solution for this problem who can point us in a direction
> to investigate further?
>
> Greetings 
>
> Peter and Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 

