lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Index Size
Date Thu, 19 Aug 2004 07:16:42 GMT
On Wednesday 18 August 2004 22:44, Rob Jose wrote:
> Hello
> I have indexed several thousand (52 to be exact) text files and I keep
> running out of disk space to store the indexes.  The size of the documents
> I have indexed is around 2.5 GB.  The size of the Lucene indexes is around
> 287 GB.  Does this seem correct?  I am not storing the contents of the

As noted, one would expect the index size to be about 35%
of the original text, i.e. about 2.5GB * 35% = 875MB.
That is more than two orders of magnitude below the 287GB you have.
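
As a rough sanity check, the arithmetic above can be written out as a short sketch. The 35% ratio is only the rule of thumb quoted in this thread, not a guarantee, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check using the figures quoted in this thread.
text_gb = 2.5            # size of the original text files
expected_ratio = 0.35    # rule-of-thumb index/text ratio (an estimate, not a guarantee)
actual_index_gb = 287.0  # reported size of the Lucene indexes

expected_index_gb = text_gb * expected_ratio
blowup = actual_index_gb / expected_index_gb

print(f"expected index size: {expected_index_gb:.3f} GB")
print(f"actual / expected:   {blowup:.0f}x")
```

A ratio in the hundreds suggests something structural (for example stored fields, or leftover files in the index directory) rather than ordinary variation between analyzers.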

Could you provide some more information about the field structure,
i.e. how many fields, which fields are stored, which fields are indexed,
any use of non-standard analyzers, and any non-standard
Lucene settings?

You might also try switching to the non-compound format to have a look
at the sizes of the individual index files; see the file formats page on the
Lucene web site.
You can then see the total disk size of, for example, the stored fields.
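
Once the index is in non-compound format, a small script can total the file sizes per extension. The following is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it assumes the index lives in a single flat directory. Per the file formats documentation, stored fields are kept in the .fdt and .fdx files, so a very large .fdt total would point at stored field data.

```python
import os
from collections import defaultdict


def sizes_by_extension(index_dir):
    """Return {extension: total bytes} for the regular files in index_dir."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            # Files without an extension (e.g. "segments") are keyed by name.
            ext = os.path.splitext(name)[1] or name
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def report(index_dir):
    """Print the per-extension totals, largest first."""
    totals = sizes_by_extension(index_dir)
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ext:>10}  {size / 1024 ** 2:12.1f} MB")
```

Calling report() with the path of the index directory prints one line per extension. A large number of files sharing the same extension may also indicate old segments that were never cleaned up.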

Regards,
Paul Elschot



