lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Peter Carlson <carl...@bookandhammer.com>
Subject Re: sanity check - index size
Date Tue, 21 May 2002 00:50:59 GMT
This seems big depending on what you are storing.

For example, I have a set of data with 457MB and my Lucene index is 115MB.
However, I don't store much.

If you are storing the complete text (even if you don't index it), then it
will be about the same size (no probably bigger) than your original data
set.

--Peter

On 5/20/02 4:16 PM, "Erik Hatcher" <lists@ehatchersolutions.com> wrote:

> I'm indexing 900+ files (less than 1,000) that total about 15MB in size.
> These are text files and HTML files.  I only index them into a few fields
> (title, content, filename).  My index (specifically _sd.fdt) is 20MB.  The
> bulk of the HTML files are Javadoc files (Ant's own documentation,
> actually).
> 
> Does that seem at all close to being reasonable/normal?  I am calling
> optimize() before closing the index.
> 
> Thanks for the sanity check.
> 
>   Erik
> 
> 
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
> 
> 


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message