lucene-java-user mailing list archives

From Aaron Schon <aaron_sc...@yahoo.com>
Subject Re: lucene suiteable ? 6 mio recods / day 1k
Date Fri, 19 Dec 2008 23:10:01 GMT
Christian, 

I do not have an answer for you (I hope some of the gurus on this list can provide an appropriate
answer).
However, I would ask that you share your findings and experience on this list.

We are facing a similar situation and would appreciate it if you shared what you learn.

Regards
AS



----- Original Message ----
From: Christian Brennsteiner <eingfoan@yahoo.de>
To: java-user@lucene.apache.org
Sent: Friday, December 19, 2008 6:22:40 AM
Subject: lucene suiteable ? 6 mio recods / day 1k

hi *,

I am searching for a full-text index capable of meeting the following requirements:

Index 3,000,000 new records every day, each valid for N days (e.g. a
90-day expiration).
That is about 34.7 records/s.
One record is e.g. a URL and can be up to 2 KB in size:

http://example.com/somedir/some.html

Lucene should use "/" as a word separator and should e.g. eliminate all ":".

So for the URL

http://example.com/somedir/some.html

the following "sentence" should be indexed:

http example.com somedir some.html
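
A minimal sketch of the splitting rule described above, in plain Java. It is not Lucene analyzer code; the class and method names (UrlTokens, tokenize) are hypothetical, and in Lucene the same rule would have to be implemented in a custom analyzer or tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

class UrlTokens {
    // Removes every ":" and splits on "/". Empty pieces, such as the one
    // between the two slashes of "//", are skipped.
    static List<String> tokenize(String url) {
        List<String> tokens = new ArrayList<>();
        for (String piece : url.replace(":", "").split("/")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints: http example.com somedir some.html
        System.out.println(String.join(" ",
                tokenize("http://example.com/somedir/some.html")));
    }
}
```

Note that dropping every ":" also fuses a port number into the host name (example.com:8080 becomes example.com8080), which may or may not be wanted.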

My main concern about this requirement is that the index should not
grow over time, as it always holds
NR OF DAYS * RECORDS PER DAY records and expires them after a given
time. In my opinion there must be some background thread always
throwing away expired hits.
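
The sizing and expiry rule above can be sketched in plain Java, assuming 3,000,000 records per day and a 90-day validity as in the example. The names (Retention, steadyStateRecords, isExpired) are hypothetical and nothing here calls Lucene; a background job in a real index would still have to delete the documents that this check marks as expired, for instance based on a stored timestamp field.

```java
import java.time.Duration;
import java.time.Instant;

class Retention {
    // Number of records the index holds once old records expire as fast
    // as new ones arrive: NR OF DAYS * RECORDS PER DAY.
    static long steadyStateRecords(long recordsPerDay, int retentionDays) {
        return recordsPerDay * retentionDays;
    }

    // Average indexing rate needed to keep up with the daily volume.
    static double recordsPerSecond(long recordsPerDay) {
        return recordsPerDay / 86_400.0;
    }

    // A record is expired once it is older than the retention window.
    static boolean isExpired(Instant indexedAt, Instant now, int retentionDays) {
        return indexedAt.isBefore(now.minus(Duration.ofDays(retentionDays)));
    }

    public static void main(String[] args) {
        // prints: 270000000
        System.out.println(steadyStateRecords(3_000_000L, 90));
        System.out.println(recordsPerSecond(3_000_000L));
        System.out.println(isExpired(Instant.parse("2008-09-01T00:00:00Z"),
                Instant.parse("2008-12-19T00:00:00Z"), 90));
    }
}
```

With these numbers the index levels off at 270 million records; at up to 2 KB each, that is at most roughly 540 GB of raw record text before any index overhead or compression.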

Is this easily possible with Lucene?

regards chris

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


      
