mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pat Ferrel <...@occamsmachete.com>
Subject Re: Broken non-zero iterator in VectorView?
Date Tue, 03 Mar 2015 00:33:23 GMT
I vaguely remember the NonZeroIterator being optimized just as we were switching to Scala.
Something Sebastian was working on but no idea if it was related to this.


On Mar 2, 2015, at 3:51 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

actually the test error is something else but i think vector view
iterator implementation is still wrong. I will scan if that produces
any more errors elsewhere.

On Mon, Mar 2, 2015 at 3:43 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
> this test is failing after i remove non-reusable elements in views,
> but we should be able to remove that claimi think based on general
> vector contract i don't think view contract should be any special,
> this would defeat inheritance concept ...
> 
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.306
> sec <<< FAILURE! - in org.apache.mahout.math.MatricesTest
> testViewDenseSparseReporting(org.apache.mahout.math.MatricesTest)
> Time elapsed: 0.016 sec  <<< FAILURE!
> java.lang.AssertionError: null
>        at __randomizedtesting.SeedInfo.seed([BB9E332570D97135:8BBE9BBA065B4C3A]:0)
>        at org.junit.Assert.fail(Assert.java:86)
>        at org.junit.Assert.assertTrue(Assert.java:41)
>        at org.junit.Assert.assertTrue(Assert.java:52)
>        at org.apache.mahout.math.MatricesTest.testViewDenseSparseReporting(MatricesTest.java:69)
> 
> Running org.apache.mahout.math.VectorBinaryAggregateCostTest
> T
> 
> On Mon, Mar 2, 2015 at 3:41 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>> it looks like an attempt to eliminate reusable elements in vector
>> view's iterators but why? Vector contract already implies element
>> reusability inside iterators, so why special treatment inside vector
>> views?
>> 
>> umph.
>> 
>> On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>> it looks like assigning 0s in a view of SequentialAccessSparseVector
>>> doesn't work, as it internally using setQuick() which tirms the length
>>> of non-zero elements (?)
>>> which causes invalidation of the iterator state.
>>> 
>>> in particular, this simple test fails:
>>> 
>>> val svec = new SequentialAccessSparseVector(30)
>>> 
>>> svec(1) = -0.5
>>> svec(3) = 0.5
>>> println(svec)
>>> 
>>> svec(1 until svec.length) ::= ( _ => 0)
>>> println(svec)
>>> 
>>> svec.sum shouldBe 0
>>> 
>>> This produces  output
>>> 
>>> {1:-0.5,3:0.5}
>>> {3:0.5}
>>> 
>>> The reason seems to be in the way vector view handles non-zero iterator:
>>> 
>>> public final class NonZeroIterator extends AbstractIterator<Element> {
>>> 
>>>  private final Iterator<Element> it;
>>> 
>>>  private NonZeroIterator() {
>>>    it = vector.nonZeroes().iterator();
>>>  }
>>> 
>>>  @Override
>>>  protected Element computeNext() {
>>>    while (it.hasNext()) {
>>>      Element el = it.next();
>>>      if (isInView(el.index()) && el.get() != 0) {
>>>        Element decorated = vector.getElement(el.index());
>>>        return new DecoratorElement(decorated);
>>>      }
>>>    }
>>>    return endOfData();
>>>  }
>>> 
>>> }
>>> 
>>> 
>>> In particular, the problem lies in the line
>>> 
>>>        Element decorated = vector.getElement(el.index());
>>> 
>>> This creates yet another element which is AbstractVector.LocalElement
>>> instead of iterator's element which would not cause iterator state
>>> invalidation during assignment.
>>> 
>>> The question is why it tries to create yet another element instead of
>>> decorating iterator's element itself in the Vector View???
>>> 
>>> I would just replace this line with simply
>>> 
>>> Element decorated = el
>>> 
>>> but i guess it might break something? what it is it might break?


Mime
View raw message