lucene-dev mailing list archives

From Yonik Seeley <>
Subject Re: Using long instead of int for docIds
Date Tue, 12 Oct 2010 03:11:59 GMT
I think ints instead of longs for docids is still the best practical
choice for today.
- longs double the size it takes to store collected ids
- Java native arrays are indexed by int (hence we couldn't collect
more than 2B matches easily anyway)
- the practical limit for a single lucene index is ~100M docs anyway
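
To put rough numbers on the first two points, here is a minimal sketch. The class and method names are made up for illustration and are not Lucene code; the 100M figure is just the practical per-index size mentioned above.

```java
public class DocIdWidth {
    // Bytes needed to hold `count` collected ids at a given width per id.
    // (Hypothetical helper, for illustration only.)
    public static long bytesFor(long count, int bytesPerId) {
        return count * bytesPerId;
    }

    public static void main(String[] args) {
        long matches = 100_000_000L; // ~100M docs, the practical limit cited above

        // An int id takes 4 bytes, a long id takes 8, so collected ids double in size.
        System.out.println("int ids:  " + bytesFor(matches, Integer.BYTES) + " bytes");
        System.out.println("long ids: " + bytesFor(matches, Long.BYTES) + " bytes");

        // Java arrays are indexed by int, so no single array can hold more
        // than Integer.MAX_VALUE (2^31 - 1) elements, whatever the id type is.
        System.out.println("array length bound: " + Integer.MAX_VALUE);
    }
}
```

So even with long ids, a collector backed by one native array would still top out at about 2B entries.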

But, perhaps MultiSearcher (or a new class called BigMultiSearcher)
should start using longs.
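
As a rough sketch of what that could look like: each sub-index keeps its int docids, and only the combined view hands out long ids by adding a per-sub-index base offset. Everything here (the LongDocIdMapper class and its methods) is hypothetical and only illustrates the idea; it is not an existing Lucene API.

```java
public class LongDocIdMapper {
    // starts[i] is the global id of the first doc in sub-index i;
    // the final entry holds the total doc count across all sub-indexes.
    private final long[] starts;

    public LongDocIdMapper(int[] maxDocs) {
        starts = new long[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    // Combine a sub-index number and its local int docid into a global long id.
    public long toGlobal(int subIndex, int localDoc) {
        return starts[subIndex] + localDoc;
    }

    // Find which sub-index a global id belongs to (linear scan for clarity).
    public int subIndexOf(long globalDoc) {
        if (globalDoc < 0 || globalDoc >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("doc id out of range: " + globalDoc);
        }
        int i = 0;
        while (starts[i + 1] <= globalDoc) {
            i++;
        }
        return i;
    }

    // Recover the int docid local to the owning sub-index.
    public int localDoc(long globalDoc) {
        return (int) (globalDoc - starts[subIndexOf(globalDoc)]);
    }

    public static void main(String[] args) {
        // Three sub-indexes whose combined size exceeds what an int can address.
        LongDocIdMapper mapper = new LongDocIdMapper(
                new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE, 10});
        long global = mapper.toGlobal(1, 5);
        System.out.println(global + " -> sub-index " + mapper.subIndexOf(global)
                + ", local doc " + mapper.localDoc(global));
    }
}
```

The point of the design is that the per-index code stays on ints, and only the layer that merges results across indexes pays for the wider ids.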


On Mon, Oct 11, 2010 at 1:24 AM, Israel Ekpo <> wrote:
> Hi Solr Devs,
> I have always had this question at the back of my mind and I would love to
> know the answers to a couple of questions.
> 1. Does using int for document ids place any restrictions on the number of
> documents that can be stored in a single index? I am assuming we cannot go
> beyond 2 to the power of 31 minus 1 documents but I have not actually tested
> this yet.
> 2. What would it take to change the core to use long instead of int for
> document ids?
> 3. Would there be any practical gains or benefits of making such a change?
> I initially wanted to send this question to the Stomp the Chomp challenge
> but I figured it would be better to open it to all.
> Any useful feedback will be highly appreciated.
> --
> °O°
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
