lucene-dev mailing list archives

From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: Using long instead of int for docIds
Date Tue, 12 Oct 2010 03:11:59 GMT
I think ints instead of longs for docids is still the best practical
choice for today.
- longs double the size it takes to store collected ids
- Java native arrays are indexed by int (hence we couldn't collect
more than 2B matches easily anyway)
- the practical limit for a single lucene index is ~100M docs anyway
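
To put rough numbers on the first two points, here is a minimal sketch in
plain Java. It does not use any Lucene API; the class and method names are
made up for illustration, and it counts only the raw array payload, ignoring
JVM object headers.

```java
public class DocIdStorage {
    // Raw payload bytes for `count` collected ids stored as int (4 bytes each).
    static long intIdBytes(long count) {
        return count * Integer.BYTES;
    }

    // Raw payload bytes for `count` collected ids stored as long (8 bytes each).
    static long longIdBytes(long count) {
        return count * Long.BYTES;
    }

    // Java arrays are indexed by int, so no single array can be addressed
    // beyond this index (real JVMs usually cap the length slightly lower).
    static int maxArrayIndex() {
        return Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long matches = 100_000_000L; // the ~100M figure mentioned above
        System.out.println("int ids:  " + intIdBytes(matches) + " bytes");
        System.out.println("long ids: " + longIdBytes(matches) + " bytes");
        System.out.println("largest array index: " + maxArrayIndex());
    }
}
```

For 100M collected matches that is 400,000,000 bytes as ints versus
800,000,000 bytes as longs, before any per-array overhead.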

But, perhaps MultiSearcher (or a new class called BigMultiSearcher)
should start using longs.
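
One way such a class might map ids is sketched below. This is only an
illustration under assumptions: LongDocIdMapper is a hypothetical name, not an
existing Lucene class, and the approach (a table of long start offsets, one
per sub-index, with each sub-index keeping its own int doc ids) is an
assumption about how a long-based multi-searcher could work, not a
description of current code.

```java
public class LongDocIdMapper {
    // starts[i] is the global id of the first doc in sub-index i;
    // starts[n] is the total doc count across all n sub-indexes.
    private final long[] starts;

    public LongDocIdMapper(int[] maxDocs) {
        starts = new long[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    public long totalDocs() {
        return starts[starts.length - 1];
    }

    // Combine a sub-index number and its local int doc id into a global long id.
    public long toGlobal(int subIndex, int localDoc) {
        return starts[subIndex] + localDoc;
    }

    // Find the sub-index that owns a global id: the largest i with
    // starts[i] <= globalDoc. Empty sub-indexes are skipped correctly.
    public int subIndex(long globalDoc) {
        if (globalDoc < 0 || globalDoc >= totalDocs()) {
            throw new IllegalArgumentException("doc id out of range: " + globalDoc);
        }
        int lo = 0;
        int hi = starts.length - 2;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= globalDoc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    // Recover the local int doc id inside the owning sub-index.
    public int localDoc(long globalDoc) {
        return (int) (globalDoc - starts[subIndex(globalDoc)]);
    }
}
```

With three sub-indexes of Integer.MAX_VALUE docs each, toGlobal(2, 10)
yields 4294967304, which no longer fits in an int, while every sub-index
still works with plain int ids internally.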

-Yonik

On Mon, Oct 11, 2010 at 1:24 AM, Israel Ekpo <israelekpo@gmail.com> wrote:
> Hi Solr Devs,
>
> I have always had this question at the back of my mind and I would love to
> know the answers to a couple of questions.
>
> 1. Does using int for document ids place any restrictions on the number of
> documents that can be stored in a single index? I am assuming we cannot go
> beyond 2 to the power of 31 minus 1 documents, but I have not actually tested this
> yet.
>
> 2. What would it take to change the core to use long instead of int for
> document ids?
>
> 3. Would there be any practical gains or benefits of making such a change?
>
> I initially wanted to send this question to the Stomp the Chomp challenge
> but I figured it would be better to open it to all.
>
> Any useful feedback will be highly appreciated.
>
> --
> °O°
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>
