lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dror Matalon <d...@zapatec.com>
Subject Re: The best way forward
Date Tue, 04 Nov 2003 18:33:59 GMT
Hi,

By the way, we're also thinking of integrating newsgroups into RSS
aggregator which you can see at  www.fastbuzz.com. 

Are you interested in comparing notes, or possibly pooling resources? We
have plenty of technical resources, and we've run news servers before,
although it's been a few years.

Regards,

Dror

On Tue, Nov 04, 2003 at 11:09:27AM +0000, jt oob wrote:
> Thank you for the replies!
> 
> My indexes are currently looking like they might be 12GB when finished
> on the current run.
> 
> I have spotted a tool on the lucene site for listing the most
> frequently occuring words in the index. Currently I am using the
> defaultAnalyzer  stoplist, I should probably use a more comprehensive
> list.
> 
> Is there a way of implementing a stoplist after the index has been
> created,  removing all occurances of the new stoplist words?
> I could then write a new Analyzer with the new stoplist for adding new
> documents to the index.
> Am i doomed to reindexing with a better stoplist?
> 
> 
> In view of the index size, I am going to see how well the kernel
> caching performs, as the index probably won't fit entirely into memory
> once the operating system and other system processes have taken their
> bite of the available memory.
> 
> Eventually i am going to try to implement something similar to google
> groups, indexing lots of NNTP traffic. Has anyone done this before with
> lucune?
> 
> 
> Thanks again,
> jt
> 
> ________________________________________________________________________
> Want to chat instantly with your online friends?  Get the FREE Yahoo!
> Messenger http://mail.messenger.yahoo.co.uk
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message