lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Saravana <>
Subject Re: indexing performance
Date Tue, 27 Feb 2007 15:49:44 GMT

I thought of getting the maximum indexing rate by lucene. However I did the
test with sample strings and I am getting close to 600 documents/sec in a
512 MB RAM with 1.9 GHz Linux machine. Searching is pretty fast and I can
create new index files based on user or based on time etc so that I will end
up in searching with multiple index files.

I am trying to index the syslogs generated from one of my busy ftp server so
that I can get counts specific to an user with the given time frame. Since
my  ftp server is very busy it can generate so much syslogs per second. And
the important point here is I do not need any ranking etc..

So anybody faced the above scenario and any suggestions would be helpful.  I
am just trying to get the maximum performance from lucene for the given
hardware then i can purchase the needed hardware.


On 2/27/07, Erick Erickson <> wrote:
> How do you expect anyone to be able to answer such an open-ended
> question? What I'd do is create a test harness that generates a random
> set of strings and try it.
> Off the top of my head, this seems like a pretty steep requirement. And
> at 2,000 docs a second you're going to have a huge index pretty soon
> so you won't be able to do this for very long.
> What are you trying to accomplish anyway?
> Erick
> On 2/27/07, Saravana <> wrote:
> >
> > Hi,
> >
> > Is it possible to scale lucene indexing like 2000/3000 documents per
> > second?  I need to index 10 fields each with 20 bytes long.  I should be
> > able to search by just giving any of the field values as criteria. I
> need
> > to
> > get the count that has same field values.
> >
> > Will it be possible?
> >
> > with regards,
> > MSK
> >

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message