mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sean Owen <sro...@gmail.com>
Subject Re: Odd vector iteration behavior
Date Fri, 12 Apr 2013 19:37:47 GMT
Yeah you have to have a clone() or copy constructor or something. Add
that? that solves the other problem. At least, that's the intent and
it is an option.

If you need the value to be long-lived, one way or the other a new
value is getting created. There's no way around it. There's only an
optimization to be had when it need not be long lived. For example the
iterators over sequence file key/values offer exactly this choice and
the default is to *not* reuse.

The reuse is a nice win in cases where you know it's going to be a benefit.
If this particular iterator will never possibly take advantage of this
optimization, there's no value in offering it, yes.

You could offer the option, if it is used in cases where the
optimization works, but that's extra complexity. I think I'd just keep
an eye on performance impact and consider that option to have it both
ways as a possible solution if needed.

On Fri, Apr 12, 2013 at 8:31 PM, Dan Filimon
<dangeorge.filimon@gmail.com> wrote:
> The problem with the 0th option is that the elements provide no way of
> cloning them.
>
> Also, reusing the value causes the problem in the first post, where the
> element advances even though it shouldn't.
> Calling next(), keeping the reference and expecting the value to not change
> when calling hasNext() seems reasonable to me.
>
> It seems non-intuitive that it should be cloned, because calling hasNext()
> could invalidate it. In my mind, the current element should only change
> when calling next().
>
>
>
> On Fri, Apr 12, 2013 at 10:22 PM, Sean Owen <srowen@gmail.com> wrote:
>
>> I'm sure I did (at least much of) the AbstractIterator change so blame
>> me... but I think the pattern itself is just fine. It's used in many
>> places in the project. Reusing the value object is a big win in some
>> places. Allocating objects is fast but a trillion of them still adds
>> up.
>>
>> It does contain a requirement, and that is that the caller is supposed
>> to copy/clone the value if it will be used at all after the next
>> iterator operation. That's the 0th option, to just fix the caller
>> here.
>>
>> On Fri, Apr 12, 2013 at 7:49 PM, Ted Dunning <ted.dunning@gmail.com>
>> wrote:
>> > The contract of computeNext is that there are no side effects visible
>> > outside (i.e. apparent functional style).  This is required since
>> > computeNext is called from hasNext().
>> >
>> > We are using a side-effecting style so we have a bug.
>> >
>> > We have two choices:
>> >
>> > a) use functional style. This will *require* that we allocate a new
>> > container element on every call to computeNext.  This is best for the
>> user
>> > because they will have fewer surprising bugs due to reuse.  If allocation
>> > is actually as bad as some people think (I remain skeptical of that
>> without
>> > tests) then this is a bad move.  If allocation of totally ephemeral
>> objects
>> > is as cheap as I think, then this would be a good move.
>> >
>> > b) stop using AbstractIterator and continue with the re-use style.  And
>> add
>> > a comment to prevent a bright spark from reverting this change.  (I
>> suspect
>> > that the bright spark who did this in the first place was me so I can be
>> > rude)
>>

Mime
View raw message