lucene-java-user mailing list archives

From Fred Rahmanian
Subject Indexing and search questions
Date Tue, 20 Apr 2010 20:43:17 GMT
I'd like to use Lucene to search text documents for the presence of a large
list of search terms. I have a file that contains thousands of entries, one
word per line. I was thinking about writing a specialized analyzer that
tokenizes the source document, looks up each token in my list of words, and
returns terms only for words that exist in my file. I'm hoping that with this
approach the index will contain only the items from my list that occur in the
document. So once the index is created, I should be able to ask the index for
all of its terms, and whatever is returned is the list of items I'm
interested in.
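
To make the idea concrete, here is a minimal sketch of the filtering step in
plain Java, with no Lucene classes involved. It splits the text into tokens,
keeps only the tokens found in the word list, and records each hit's character
offsets. The names WordListMatcher, Match, and findMatches are hypothetical,
and the tokenization rule (runs of letters and digits, lowercased) is an
assumption; inside Lucene the same lookup would presumably live in a custom
token filter rather than in a standalone class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: illustrates the word-list lookup only, not Lucene's API.
final class WordListMatcher {

    /** One hit: the matched word and its character offsets in the document. */
    static final class Match {
        public final String term;
        public final int start; // inclusive
        public final int end;   // exclusive

        Match(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    // Assumed tokenization: a token is a run of letters and/or digits.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Returns every token of {@code text} that appears in {@code wordList}. */
    static List<Match> findMatches(String text, Set<String> wordList) {
        List<Match> matches = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            // Lowercase so the lookup is case-insensitive; the word list
            // is assumed to be stored in lowercase as well.
            String token = m.group().toLowerCase(Locale.ROOT);
            if (wordList.contains(token)) {
                matches.add(new Match(token, m.start(), m.end()));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // In practice the set would be loaded from the one-word-per-line file.
        Set<String> wordList = new HashSet<>(Arrays.asList("lucene", "index"));
        for (Match hit : findMatches("Lucene builds an index.", wordList)) {
            System.out.println(hit.term + " [" + hit.start + ", " + hit.end + ")");
        }
    }
}
```

A HashSet keeps each lookup at roughly constant time, so a list of thousands
of words should not slow down tokenization noticeably.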

I'm new to Lucene, so I'm not sure if I'm going about this the right way.
Lastly, I want to be able to run this process for thousands of documents and
store the matches (and their offsets) in a db. So it should be fairly

I appreciate any comments.


