httpd-mbox-dev mailing list archives

From Justin Erenkrantz <jus...@erenkrantz.com>
Subject Re: svn commit: r161526 - httpd/mod_mbox/trunk/module-2.0/mod-mbox-util.c httpd/mod_mbox/trunk/module-2.0/mod_mbox_index.c
Date Sat, 16 Apr 2005 07:54:51 GMT
On Sat, Apr 16, 2005 at 12:44:54AM -0700, Paul Querna wrote:
> In the future, mod-mbox-util needs to know what files should be added to 
> a full text index -- that means you don't want incomplete files included 
> in such an index. You could always have the Apache Module prune those 
> from the search results at runtime, but that seems far more complicated 
> to me.

In the full text index, you would need to provide the range of documents you
want to retrieve results over.  (The user doesn't necessarily want to search
every document in the corpus: they'd want to restrict the search to a
particular list or a particular collection of lists.)  So, mod_mbox simply
wouldn't ask for the documents that httpd has excluded.  Admittedly, I've only
worked with research-quality search backends, but every one of them has
supported that kind of restriction.
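
As a rough illustration of that kind of restriction (a sketch only, not
mod_mbox or Lucene code; every name below is hypothetical), the allowed
documents can be held as a bit set with one bit per document id, and the
ranked hits are then checked against it:

```python
def build_allowed_mask(doc_lists, wanted_lists, excluded_docs):
    """Build an int used as a bit set: bit i is 1 if document i may be returned.

    doc_lists     -- list name for each document, indexed by document id
    wanted_lists  -- set of list names the user wants to search
    excluded_docs -- set of document ids the indexer chose to ignore
    """
    mask = 0
    for doc_id, list_name in enumerate(doc_lists):
        if list_name in wanted_lists and doc_id not in excluded_docs:
            mask |= 1 << doc_id
    return mask


def filter_hits(hits, mask):
    """Keep only the hits whose bit is set, preserving ranking order."""
    return [doc_id for doc_id in hits if (mask >> doc_id) & 1]


# Hypothetical corpus: document ids 0..4 and the list each belongs to.
doc_lists = ["dev", "dev", "users", "dev", "users"]
excluded = {3}                      # e.g. an incomplete file
mask = build_allowed_mask(doc_lists, {"dev"}, excluded)

hits = [3, 0, 4, 1]                 # ranked results from an unrestricted search
print(filter_hits(hits, mask))
```

The point of the sketch is that excluding a document is just a cleared bit, so
nothing has to be pruned from the results by hand afterwards.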

...quick look into Lucene...  Aha.

It's covered in the Lucene FAQ, here:

http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-884dadb3abdf5c520a84a2328c6d34a98cc2f4f9

So, no, I think it's fine.  =)  You just have the bit fields set to exclude
the documents that httpd has ignored.  -- justin
