lucene-java-user mailing list archives

From "Karsten Konrad" <Karsten.Kon...@xtramind.com>
Subject Re: How to acces informations from a part of the index
Date Fri, 09 Jul 2004 09:28:56 GMT

Hi,

Why don't you just use two indexes? You probably do not have to index the test set at all.

If you have two or more subsets, just use filters that match only the subsets you are interested
in. Counting the documents in one subset that contain a certain term then becomes
a search over the filtered index followed by counting the results. Filters
are quite efficient.
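
To illustrate the filter idea, here is a minimal sketch in plain Java. It does not use Lucene's API; it only assumes that a filter can be modeled as a bit set over document numbers and that the documents matching a term can be collected into a second bit set. The class and method names (SubsetTermCount, docSet, countInSubset) are hypothetical. The subset {3,5} comes from the question below; the set of documents containing "hello" is made up for the example.

```java
import java.util.BitSet;

public class SubsetTermCount {

    // Build a bit set with one bit per document number in the index.
    static BitSet docSet(int numDocs, int... docIds) {
        BitSet bits = new BitSet(numDocs);
        for (int id : docIds) {
            bits.set(id);
        }
        return bits;
    }

    // Count the documents that contain the term AND belong to the subset.
    // The inputs are left unmodified; the intersection works on a copy.
    static int countInSubset(BitSet termDocs, BitSet subset) {
        BitSet hits = (BitSet) termDocs.clone();
        hits.and(subset);
        return hits.cardinality();
    }

    public static void main(String[] args) {
        int numDocs = 6;                              // index holds documents 0..5
        BitSet subset = docSet(numDocs, 3, 5);        // the subset from the question
        BitSet helloDocs = docSet(numDocs, 0, 3, 4);  // assumed: docs containing "hello"
        System.out.println(countInSubset(helloDocs, subset)); // prints 1
    }
}
```

The intersection is a word-wise AND over the two bit sets, so the cost depends on the size of the index rather than on checking each matching document against the subset one by one.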

Hope this helps,

Karsten


--

Dr.-Ing. Karsten Konrad
Head of Artificial Intelligence Lab

Xtramind Technologies GmbH 
Stuhlsatzenhausweg 3 
D-66123 Saarbrücken

Phone +49 (681) 3 02-51 13 
Fax +49 (681) 3 02-51 09
karsten.konrad@xtramind.com 
www.xtramind.com




-----Original Message-----
From: clibois@student.info.ucl.ac.be [mailto:clibois@student.info.ucl.ac.be]
Sent: Friday, 9 July 2004 11:22
To: lucene-user@jakarta.apache.org
Subject: How to acces informations from a part of the index


Hello,
for my thesis I have to use a Lucene index for a text categorization program. For that I need
to split the index in two, so that I have a learning set and a
validation set. The problem is that I don't know how to ask Lucene to give
me, for example, the number of documents IN ONLY ONE of these subsets
containing a specific term.
For example, I would like to get the number of documents containing the term "hello" in a
subset of documents. This subset is a set of document numbers ({5,3}), and the
complete index would contain documents {0,1,2,3,4,5}.
How can I do this in an efficient way?
I tried to get all documents containing the term and then check which documents
belong to my subset. However, this turns out to be very slow. Thanks in advance
Claude Libois


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



