lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shai Erera <ser...@gmail.com>
Subject Re: ArrayUtils.getNextSize
Date Thu, 09 Apr 2009 08:59:45 GMT
Thanks Mike.

On Thu, Apr 9, 2009 at 11:55 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Wed, Apr 8, 2009 at 11:22 PM, Shai Erera <serera@gmail.com> wrote:
> > Hi
> >
> > I used ArrayUtils.getNextSize recently to expand an array to a new size.
> > When I read the documentation (the inline in the method), I saw this:
> >
> > "The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ..."
> >
> > I'm not sure if I'm misunderstanding the comment, or if there is a bug in
> > the implementation. The way I understand it, the method will always
> return
> > 0, 4, 8, 16 ... values, such that if I pass 9-15 I'll always get 16. But
> > then I noticed that if I pass 15, I get back 22 (which is the result of
> the
> > computation). Is it a bug?
>
> Note: this was lifted from Python's sources, in how it grows a list as
> you append elements to it (credit/license reference for this is at the
> end of Lucene's LICENSE.txt).
>
> It's fine that you got 22 back.
>
> The comment really means "if you only use getNextSize to get the
> series of sizes" that's the sequence that's followed.  If you pass in
> your own size (not previously returned by getNextSize), you'll get a
> different sequence, which is fine.
>
> > Also, while on that subject - why don't we return the closest power of 2?
> > I'm not familiar with that expansion policy, and I don't know how it will
> > behave for large values. I guess a pure "closest power of 2" is not too
> good
> > since for very large values it will return twice the size which is not
> too
> > good, but we can have a smarter impl. which checks the value passed and
> > returns a new number in chunks of let's say 1K, when the passed in number
> is
> > very large (let's say 1M).
>
> It's designed to grow exponentially, but at a more lenient rate than
> 2X which indeed for large values is costly.
>
> For smallish values it grows aggressively, then for larger values it
> converges to an oversize of 1/8th (12.5%).
>
> > If you think that's useful, I can open an issue and either change the
> method
> > impl, or add a new one getNextPower2Size.
>
> I don't think we need to change it.  Python has converged on this
> approach over the years ;)
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Mime
View raw message