lucene-java-user mailing list archives

From Dror Matalon <d...@zapatec.com>
Subject Re: Ways to search indexes
Date Wed, 03 Dec 2003 19:45:31 GMT
On Wed, Dec 03, 2003 at 02:49:12PM +0000, jt oob wrote:
>  --- Dror Matalon <dror@zapatec.com> wrote:
> > On Tue, Dec 02, 2003 at 01:54:58PM +0000, jt oob wrote:
> > > Hi,
> > > 
> > > I have just indexed a lot of news (nntp) postings.
> > > I now have an index for each topic (a topic can have many
> > > newsgroups)
> > > 
> > > The index sizes are:
> > > 
> > > 2.6G Current Affairs
> > > 2.4G Celebs
> > > 119M Recreation
> > > 3.0M Tech - Mac
> > > 2.4G Tech - Windows
> > > 936M Tech - Linux
> > > 702M Tech - Other
> > >  96M Tech - Consoles
> > 
> > Around 15 gigs. How many days of news?
> 
> Not sure how many days, but it's around 5 million postings.

So each posting is roughly 3K. That's more than I would have thought,
but not too surprising.

The main reason I asked how many days is to get a sense of growth.
15 gigs is a big index, but to understand the performance implications
the rate of growth is equally important. I suspect that by the time you
hit 100 gigs, you'll have one of the biggest indexes around, and you'll
have to throw fairly heavy hardware at it or distribute the load to get
reasonable performance.
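
For what it's worth, here is the back-of-envelope arithmetic as a small
Python sketch. The 15 gig and 5 million figures come from this thread;
the postings-per-day rate is purely a made-up placeholder (nobody has
measured it yet), and growth is assumed to be linear, which may well not
hold for a real news feed.

```python
# Figures quoted in the thread (decimal units assumed).
# Note: the per-topic sizes listed in the quoted message sum to roughly
# 9.3G; 15G is the rounded estimate used in the discussion.
TOTAL_INDEX_BYTES = 15 * 10**9   # "around 15 gigs"
POSTINGS = 5_000_000             # "around 5 million postings"


def bytes_per_posting(total_bytes: float, postings: int) -> float:
    """Average index bytes consumed per posting."""
    return total_bytes / postings


def days_until(target_bytes: float, current_bytes: float,
               postings_per_day: float, per_posting_bytes: float) -> float:
    """Days until the index reaches target_bytes, assuming linear growth."""
    remaining = target_bytes - current_bytes
    if remaining <= 0:
        return 0.0
    return remaining / (postings_per_day * per_posting_bytes)


per_posting = bytes_per_posting(TOTAL_INDEX_BYTES, POSTINGS)

# HYPOTHETICAL ingest rate, for illustration only.
ASSUMED_POSTINGS_PER_DAY = 50_000

days_to_100g = days_until(100 * 10**9, TOTAL_INDEX_BYTES,
                          ASSUMED_POSTINGS_PER_DAY, per_posting)

print(f"{per_posting:.0f} bytes per posting")
print(f"{days_to_100g:.0f} days to reach 100 gigs at the assumed rate")
```

Plug in the real postings-per-day number once you know it; that is the
figure that tells you how soon the hardware question becomes urgent.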

> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com


