lucene-java-user mailing list archives

From Daniel Noll
Subject Re: Adding large files to index
Date Wed, 25 Apr 2007 23:34:31 GMT
David Xiao wrote:
> Consider reducing the size of each file. Splitting them into smaller
> pieces will definitely help the indexer work faster.
> A 50M pure text file is an amazing size; very few text files reach
> 50M. There must be a very good reason if you have to keep all the
> information in one such big file.
> What do you think?

Not everyone using Lucene is writing a CMS.  Some of us do have to deal
with arbitrarily big data when it appears.  How do you split an
arbitrary, large text file?  Will breaking it in two make some queries
stop working, e.g. if the user enters +term1 +term2 and the two terms
end up on opposite sides of the split?
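
To make the concern concrete, here is a rough sketch in plain Java with
no Lucene dependency.  The class SplitDemo and its methods are made up
for illustration: each "document" is treated as a bag of terms, and a
query with two required terms matches only if a single document
contains both.  It assumes a naive split at an arbitrary point and is
not how Lucene itself evaluates queries.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this does not use any Lucene API.
class SplitDemo {
    // Stand-in for an indexed document: the set of its lowercased terms.
    static Set<String> terms(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
    }

    // True if at least one document contains every required term,
    // which mimics a query such as +term1 +term2.
    static boolean anyDocMatchesAll(List<String> docs, String... required) {
        for (String doc : docs) {
            if (terms(doc).containsAll(Arrays.asList(required))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String whole = "alpha appears early and omega appears late";
        int mid = whole.indexOf(" and ");

        // The same text indexed as one document, then as two pieces.
        List<String> unsplit = Arrays.asList(whole);
        List<String> split = Arrays.asList(
                whole.substring(0, mid), whole.substring(mid + 1));

        System.out.println(anyDocMatchesAll(unsplit, "alpha", "omega")); // true
        System.out.println(anyDocMatchesAll(split, "alpha", "omega"));   // false
    }
}
```

Once the text is split, neither piece holds both terms, so the query
that matched the whole file no longer matches anything.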

That being said, I haven't had issues adding files of this size.  But
then, our application doesn't require the ability to read at the same
time as some other thread is writing, so our memory requirements are
lower to begin with.


Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Fax: +61 2 9212 6902
