Mailing-List: contact dev-help@commons.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Commons Developers List" <dev@commons.apache.org>
Received-SPF: neutral (nike.apache.org: 209.85.217.174 is neither permitted
 nor denied by domain of mberman@sqrrl.com)
MIME-Version: 1.0
In-Reply-To: <54170C2A.7090900@gmail.com>
References: 
 <CAA+ACqfMYAsZAW+=Cek1B3EULcp3zF5aJfu8MQkjSdjWfzTC1A@mail.gmail.com>
 <54072D0B.1060907@gmail.com>
 <CAA+ACqdC9ELJP11xh_LfKSYsjka59kKivEsMrisf5muggbh3Yw@mail.gmail.com>
 <5409D6A6.7040708@gmail.com>
 <CAA+ACqfLQ8eNwd3Q4=P1yjWdV9fNb9r517V33L_MAY9Qfwz_HA@mail.gmail.com>
 <540B7B17.6050107@gmail.com>
 <CAA+ACqeFivpTEF_Vyo9GxWvJ2QLeQwScCdRncH7QE-qJSw++iQ@mail.gmail.com>
 <5411C60F.40700@gmail.com> <5411EC4A.4030405@gmail.com>
 <54170C2A.7090900@gmail.com>
From: Michael Berman <mberman@sqrrl.com>
Date: Mon, 15 Sep 2014 12:02:43 -0400
Message-ID: 
 <CAA+ACqcE=EQh4i00GQDCGzQwnpdaratD_tggykywca27hMg_xQ@mail.gmail.com>
Subject: Re: [pool] POOL-272: per-key numTestsPerEvictionRun
To: Commons Developers List <dev@commons.apache.org>
Content-Type: multipart/alternative; boundary=089e0158c31a11977005031cc554

--089e0158c31a11977005031cc554
Content-Type: text/plain; charset=UTF-8

I'm fine with pushing it off to 2.4, since it doesn't seem like we're
converging on an obviously preferred implementation.

That said, I'm not too worried about maintaining the per-key iterator, even
though the key list is dynamic. The eviction process already maintains its
own list of keys, which is maintained in parallel with the list of keys in
pool. It seems like switching from a list<key> to a map<key,iterator> would
be straightforward, and we could continue maintaining it in exactly the
same places the existing list is managed.

On Mon, Sep 15, 2014 at 11:56 AM, Phil Steitz <phil.steitz@gmail.com> wrote:

> On 9/11/14 11:39 AM, Phil Steitz wrote:
> > On 9/11/14 8:55 AM, Phil Steitz wrote:
> >> On 9/9/14 11:38 AM, Michael Berman wrote:
> >>> Great, thanks! I have all the new config property boilerplate in place
> and
> >>> I understand how it should work, but I'm debating the best way to
> implement
> >>> the per-key queues. It would be nice if the test iteration for each key
> >>> resumed in the same place the next time the cycle hits that key, but
> this
> >>> implies keeping around an iterator for every key. Right now, the test
> cycle
> >>> attempts to start from where it left off within the key, although it
> gives
> >>> up and moves onto the next key pretty aggressively if anything goes
> wrong,
> >>> and then starts back at the beginning the next time it comes back
> around.
> >>> Do you think that behavior should be preserved?
> >> I see the problem and I suspect current behavior won't work for what
> >> you or anyone else wanting this feature would like.  Behavior would
> >> be OK for the key where the evictor left off, but visits would then
> >> revert to the same initial instances per key later on.  I can't
> >> think of a reliable way around this other than maintaining iterators
> >> for each key, as you suggest.
> > Thinking about this some more, I am not sure the per key iterator
> > idea is so hot.  This will be very tricky to maintain given that the
> > key list itself is dynamic.  Other options would be to select the
> > instances to hit per key randomly somehow or just keep track of
> > "offsets" per key.  The latter would not be precise or accurate
> > given pool membership changes, but would keep the evictor from
> > getting stuck hitting the beginning of the dequeue all of the time.
> > Random "offsets" might be a decent strategy for most practical use
> > cases and would be easy to implement.
>
> I guess above is not appealing as it does not guarantee sequential
> order (or ever) evictor visits.  I will summarize above to the
> ticket and push this issue out to 2.4 if thats OK.
>
> Phil
> >
> > Phil
> >> Phil
> >>> As far as maintaining iterators for each key goes, we could convert
> >>> poolKeyList, which is only read by the evictor process now into a
> >>> Map<K,Optional<Iterator<T>>> or something to track our place iterating
> >>> through the Deque for keys we've already seen (and just insert None
> when a
> >>> new key comes into existence, when we finish an iteration, or if we
> bail on
> >>> an iteration after coming across a borrowed object). Does that sound
> >>> reasonable to you?
> >>>
> >>> On Sat, Sep 6, 2014 at 5:22 PM, Phil Steitz <phil.steitz@gmail.com>
> wrote:
> >>>
> >>>> On 9/5/14 8:44 AM, Michael Berman wrote:
> >>>>> I was proposing having both properties be maxes, not mins; whichever
> we
> >>>> hit
> >>>>> first is the limit.
> >>>> Sorry, I misunderstood.  That sounds fine to me.  I would be happy
> >>>> to work with you on a patch implementing this.  If you want to take
> >>>> a stab at it, just attach the patch (including unit test) to the
> >>>> ticket.
> >>>>
> >>>> Phil
> >>>>>  I agree that not having a cap on the total number of
> >>>>> tests performed is dangerous (I wouldn't really set it to max int;
> just
> >>>>> something pretty high). However, there is some precedent for
> unbounded
> >>>>> tests: currently, if you set the limit to a negative, the number of
> tests
> >>>>> scales with the number of values. This doesn't seem fundamentally
> more
> >>>>> dangerous than scaling with the number of keys.
> >>>>>
> >>>>> In some situations, though, it's true that it's important to be able
> to
> >>>> cap
> >>>>> the total number of tests. By the same logic, I may also not want an
> >>>>> unbounded number of tests to be able to occur for each key, which
> makes
> >>>> the
> >>>>> "there will always be numTests performed regardless of the perKey
> >>>> setting"
> >>>>> option sound scary. Maybe there should be another switch for whether
> we
> >>>>> want the min or the max?
> >>>>>
> >>>>>
> >>>>> On Fri, Sep 5, 2014 at 11:28 AM, Phil Steitz <phil.steitz@gmail.com>
> >>>> wrote:
> >>>>>> On 9/3/14 8:53 AM, Michael Berman wrote:
> >>>>>>> Thanks for the feedback Phil!
> >>>>>>>
> >>>>>>>
> >>>>>>>> One way to do this would be to have the current property be a cap
> on
> >>>>>>>> the total number of tests performed, but the new one basically
> >>>>>>>> control the number done per key.  So if, e.g.
> numTestsPerEvictionRun
> >>>>>>>> is 12, numEvictionTestsPerKey (or whatever we decide to call it)
> is
> >>>>>>>> 3,  the evictor does 3 tests for each key and then moves on to the
> >>>>>>>> next immediately, up to 12.  I guess if there are fewer than
> >>>>>>>> numEvictionTestsPerKey, the evictor just moves on.  If the evictor
> >>>>>>>> gets all the way through before numTestsPerEcvictionRun it could
> >>>>>>>> wrap around or just stop.  Would something like that work for you?
> >>>>>>>> Anyone have any better ideas?
> >>>>>>> "No more than numEvictionTestsPerKey per key and no more than
> >>>>>>> numTestsPerEvictionRun total" would definitely address my use case
> (in
> >>>> my
> >>>>>>> case, I'll just set numTestsPerEvictionRun to Int.MAX or
> something),
> >>>> but
> >>>>>> I
> >>>>>>> do wonder if that's what everyone would want. I could imagine
> expecting
> >>>>>> the
> >>>>>>> behavior to be "At least numEvictionTestsPerKey per key and at
> least
> >>>>>>> numTestsPerEvictionRun total" (which would be the behavior if we
> >>>> wrapped
> >>>>>>> around instead of just stopping). I suppose it's easy to make it
> clear
> >>>>>>> which behavior we have in documentation, but that presumes the one
> I
> >>>> want
> >>>>>>> is likely to be universally more useful than the alternative.
> Think I
> >>>> can
> >>>>>>> make that assumption?
> >>>>>> We would have to rename numTestsPerEvictionRun if we were to change
> >>>>>> it to mean a min.  I would personally not like that.  I would rather
> >>>>>> leave it defined as is and just have the new property control when
> >>>>>> the evictor moves from key to key.  Another option would be to have
> >>>>>> the new property override the old one - so if numEvictionTestsPerKey
> >>>>>> is set (i.e. not the default), then numTestsPerEvictionRun is
> >>>>>> ignored and the evictor is controlled completely by the second one.
> >>>>>>
> >>>>>> Personally, I would favor the first of my suggestions above -
> >>>>>> numTestsPerEvictionRun determines the total number of tests
> >>>>>> performed and numEvictionTestsPerKey determines how many tests are
> >>>>>> done in each key set before moving to the next key.  Not having a
> >>>>>> cap on the total number of tests performed is dangerous, as there is
> >>>>>> no bound to the number of keys so just doing numEvictionTestsPerKey
> >>>>>> for every key each evictor run could result in very long-running
> >>>>>> evictors.
> >>>>>>
> >>>>>> It would be good to get some other opinions on this.  Anyone have
> >>>>>> ideas / preferences?
> >>>>>>
> >>>>>> Phil
> >>>>>>
> >>>>>>
> >>>>>>> With respect to naming:
> >>>>>>> numEvictionTestsPerKey is definitely more concise and readable than
> >>>>>>> numTestsPerEvictionRunPerKey, but I wonder if it makes it less
> obvious
> >>>>>> that
> >>>>>>> it's tightly connected to numTestsPerEvictionRun. In code
> completion,
> >>>> the
> >>>>>>> setters wouldn't appear next to each other, which might hurt
> >>>>>>> discoverability. Maybe just referencing each other in docs is
> enough to
> >>>>>>> mitigate that. Any votes one way or the other?
> >>>>>>>
> >>>>>>
> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>>>>> For additional commands, e-mail: dev-help@commons.apache.org
> >>>>>>
> >>>>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>>> For additional commands, e-mail: dev-help@commons.apache.org
> >>>>
> >>>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

--089e0158c31a11977005031cc554--