lucene-java-user mailing list archives

From "Michael Siu" <michaely...@hotmail.com>
Subject RE: Limit of Lucene
Date Thu, 08 May 2008 18:13:36 GMT
The number of documents that we are going to index could potentially exceed
2G. So I guess I have to split the index into multiple indexes, each
containing up to 2G documents. Any other suggestions?

Thanks.

-----Original Message-----
From: Karl Wettin [mailto:karl.wettin@gmail.com] 
Sent: Thursday, May 08, 2008 11:00 AM
To: java-user@lucene.apache.org
Subject: Re: Limit of Lucene

Michael Siu wrote:
> What is the limit of Lucene:  # of docs per index?

Integer.MAX_VALUE

Multiple indices joined in a single MultiWhatNot is still limited to 
that number.
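
To put a number on that limit, here is a minimal sketch in plain Java (no Lucene calls). It takes the Integer.MAX_VALUE figure above at face value and computes how many separate indexes a corpus would need. The class DocLimit, the method indexesNeeded, and the 5 billion corpus size are hypothetical, chosen only for illustration. Since a combined reader has the same ceiling, this assumes the indexes are searched separately and the application merges the results.

```java
// Hypothetical helper, not part of Lucene: plain arithmetic on the int doc-number limit.
final class DocLimit {
    // Document numbers are Java ints, so a single index (or one combined reader)
    // is assumed to hold at most Integer.MAX_VALUE documents.
    static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE; // 2^31 - 1 = 2147483647

    // Smallest number of separately searched indexes needed for totalDocs documents.
    static long indexesNeeded(long totalDocs) {
        if (totalDocs < 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0");
        }
        // Ceiling division done in long arithmetic.
        return (totalDocs + MAX_DOCS_PER_INDEX - 1) / MAX_DOCS_PER_INDEX;
    }

    public static void main(String[] args) {
        long totalDocs = 5_000_000_000L; // assumed corpus size, for illustration only
        System.out.println("limit per index: " + MAX_DOCS_PER_INDEX);
        System.out.println("indexes needed:  " + indexesNeeded(totalDocs));
    }
}
```

A Java int is 32 bits on every JVM, so the ceiling does not change between 32-bit and 64-bit machines.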


> RangeFilter.Bits(), for example, initializes a bitset to the size of
> maxDoc from the indexReader.  I wonder what happens if the # of docs is
> huge, say MaxInt (4G in 32bit or 2^63 in 64 bit)?

ArrayIndexOutOfBoundsException ?

It should not be that difficult to upgrade ints to longs, but it is a 
rather large job.
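
The bitset question can be estimated with plain arithmetic. The sketch below assumes one bit per document number up to maxDoc, as the question describes, and uses java.util.BitSet only to show how it reacts to a size that has overflowed the int range. BitsetCost and its methods are hypothetical names, and no Lucene classes are involved.

```java
// Hypothetical helper, not part of Lucene.
final class BitsetCost {
    // Bytes needed for one bit per document, rounded up to whole 64-bit words
    // (java.util.BitSet stores its bits in a long[]).
    static long bitsetBytes(long maxDoc) {
        long words = (maxDoc + 63) / 64;
        return words * 8;
    }

    // Narrowing a long document count to int keeps only the low 32 bits,
    // so counts above Integer.MAX_VALUE come out wrong, often negative.
    static int narrowed(long docCount) {
        return (int) docCount;
    }

    // Returns true if java.util.BitSet refuses the given size.
    static boolean rejectsNegativeSize(int nbits) {
        try {
            new java.util.BitSet(nbits);
            return false;
        } catch (NegativeArraySizeException e) {
            return true; // BitSet(int) throws this for negative sizes
        }
    }

    public static void main(String[] args) {
        long bytes = bitsetBytes(Integer.MAX_VALUE);
        System.out.println("bitset for MAX_VALUE docs: " + (bytes >> 20) + " MiB");
        int overflowed = narrowed(3_000_000_000L); // assumed count, for illustration
        System.out.println("3e9 narrowed to int: " + overflowed);
        System.out.println("BitSet rejects it:   " + rejectsNegativeSize(overflowed));
    }
}
```

So even at the limit a single filter bitset is a few hundred megabytes, and a count beyond the limit cannot be represented as an int at all.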

How many documents do you have? You might want to consider alternative 
ways to represent your corpus in the index so it takes fewer documents.


           karl

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


