lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Limit of Lucene
Date Thu, 08 May 2008 18:28:23 GMT
In practice, you will more than likely have to distribute your index  
across multiple nodes once you get somewhere in the range of tens of  
millions of documents, but it all depends on your hardware, documents,  
throughput needs, etc.
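
To make the sizing concrete, here is a rough back-of-the-envelope sketch in plain Java. It is not Lucene API: the class and method names (ShardPlanner, shardsNeeded, shardFor, bitsetBytes) are hypothetical, and the 3 billion document corpus and 50 million documents per node are assumptions picked only to match "more than 2G" and "tens of millions" from this thread. It also estimates the memory for a one-bit-per-document bitset sized to maxDoc, as raised in the quoted question.

```java
public class ShardPlanner {
    // Per-index ceiling named in the thread: document numbers are Java ints.
    static final long MAX_DOCS_PER_INDEX = Integer.MAX_VALUE; // 2,147,483,647

    /** Number of indexes needed so that each holds at most docsPerShard documents. */
    static int shardsNeeded(long totalDocs, long docsPerShard) {
        if (totalDocs < 0 || docsPerShard <= 0 || docsPerShard > MAX_DOCS_PER_INDEX) {
            throw new IllegalArgumentException("invalid document counts");
        }
        // Ceiling division without floating point.
        return (int) ((totalDocs + docsPerShard - 1) / docsPerShard);
    }

    /** Picks a shard for a document key; floorMod keeps the result non-negative. */
    static int shardFor(String docKey, int shardCount) {
        return Math.floorMod(docKey.hashCode(), shardCount);
    }

    /** Bytes needed for a one-bit-per-document bitset sized to maxDoc. */
    static long bitsetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    public static void main(String[] args) {
        long totalDocs = 3_000_000_000L; // assumption: a corpus larger than 2G documents
        long perShard = 50_000_000L;     // assumption: "tens of millions" per node

        int shards = shardsNeeded(totalDocs, perShard);
        System.out.println("shards needed: " + shards); // 60

        // Hypothetical document key, routed by hash to one of the shards.
        System.out.println("doc-42 goes to shard " + shardFor("doc-42", shards));

        // One bitset at the per-index ceiling: 268435456 bytes (256 MiB).
        System.out.println("bitset bytes at max: " + bitsetBytes(MAX_DOCS_PER_INDEX));
    }
}
```

Hash routing is only one option; splitting by date or by source works just as well if it fits the queries better. The point is that each index stays far below Integer.MAX_VALUE and per-query structures sized to maxDoc stay small.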


On May 8, 2008, at 2:13 PM, Michael Siu wrote:

> The # of documents that we are going to index could potentially be
> more than 2G. So I guess I have to split the index into multiple
> indexes, each containing up to 2G documents. Any other suggestions?
>
> Thanks.
>
> -----Original Message-----
> From: Karl Wettin [mailto:karl.wettin@gmail.com]
> Sent: Thursday, May 08, 2008 11:00 AM
> To: java-user@lucene.apache.org
> Subject: Re: Limit of Lucene
>
> Michael Siu skrev:
>> What is the limit of Lucene:  # of docs per index?
>
> Integer.MAX_VALUE
>
> Multiple indices joined in a single MultiWhatNot is still limited to
> that number.
>
>
>> RangeFilter.Bits(), for example, initializes a bitset to the size of
>> maxDoc from the indexReader.  I wonder what happens if the # of docs
>> is huge, say MaxInt (4G in 32bit or 2^63 in 64 bit)?
>
> ArrayIndexOutOfBoundsException ?
>
> It should not be that difficult to upgrade int to longs, but it is a
> rather large job.
>
> How many documents do you have? You might want to consider alternative
> ways to represent your corpus in the index so it takes fewer documents.
>
>
>           karl
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






