On Tue, Aug 16, 2011 at 02:09:03PM -0700, Ted Dunning wrote:
> Here is an example from the perspective of somebody adding a new kind of
> matrix.
>
> Take the two kinds of matrix as RandomTrinaryMatrix(rows, columns, p) that
> has elements that are -1, 0 or 1. -1 and 1 have equal probabilities of p/2.
> The value of p should be in [0,1].
>
> It would be very nice if the implementor of this matrix could extend an
> abstract matrix and override get() to generate a value and set() to throw
> an unsupported operation exception.
Do you mean that the matrix is stateless (each call to "get" generates a new
value)?
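
To make the question concrete, here is a minimal sketch of one possible reading, in which "get" is deterministic: the value is derived from a seed and the entry's indices, so repeated calls return the same number. The base class "SketchMatrix", the "seed" parameter and the "isSparse" method are assumptions made for illustration; they are not the Commons Math or Mahout API, and the hashing scheme makes no claim about statistical quality.

```java
import java.util.Random;

/** Hypothetical minimal base class; not the Commons Math or Mahout API. */
abstract class SketchMatrix {
    private final int rows;
    private final int columns;

    SketchMatrix(int rows, int columns) {
        this.rows = rows;
        this.columns = columns;
    }

    int getRowDimension() { return rows; }
    int getColumnDimension() { return columns; }

    abstract double get(int row, int column);
    abstract void set(int row, int column, double value);
    abstract boolean isSparse();
}

/** Entries are -1, 0 or 1; -1 and 1 each occur with probability p/2. */
class RandomTrinaryMatrix extends SketchMatrix {
    private final double p;
    private final long seed; // assumption: a seed makes get() repeatable

    RandomTrinaryMatrix(int rows, int columns, double p, long seed) {
        super(rows, columns);
        if (p < 0 || p > 1) {
            throw new IllegalArgumentException("p must be in [0,1]: " + p);
        }
        this.p = p;
        this.seed = seed;
    }

    @Override
    double get(int row, int column) {
        // Derive a per-entry generator from (seed, row, column) so that
        // the same entry always yields the same value.
        long index = (long) row * getColumnDimension() + column;
        Random r = new Random(seed ^ (index * 0x9E3779B97F4A7C15L));
        double u = r.nextDouble();
        if (u < p / 2) {
            return -1;
        }
        if (u < p) {
            return 1;
        }
        return 0;
    }

    @Override
    void set(int row, int column, double value) {
        // Nothing is stored, so there is nothing to modify.
        throw new UnsupportedOperationException("read-only matrix");
    }

    @Override
    boolean isSparse() {
        // Threshold taken from the quoted proposal: sparse if p < 0.1.
        return p < 0.1;
    }
}
```

The other reading, where each call draws a fresh value from a shared generator, would make "m.get(i, j) == m.get(i, j)" false most of the time, which is why the distinction matters for any algorithm that reads an entry more than once.
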
> If p < 0.1, then the matrix should be
> marked as sparse, else as dense.
>
> All operations against other matrices, sparse or dense should work well
> without any special handling by the implementor of this matrix.
>
> This works in Mahout for instance by having the default operations in
> AbstractMatrix test for sparseness of left or right operands and do the
> right thing. Obviously, a type test will not tell you whether this matrix
> is sparse or not.
As far as I understand, the sparseness test is an indicator that there is an
optimized way to iterate over the entries (faster than a loop over all the
indices) or that not all entries are explicitly stored. Is that correct?
If so, this is tied to the storage layout of the matrix, i.e. its type (which
may be chosen from an input value, such as "p" above).
However, once the matrix is created, its type/implementation is either dense
or sparse.
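
For comparison, the flag-based approach described in the quoted text might be sketched as follows. This is only an illustration under assumptions: "FlaggedMatrix", "DenseStore", "SparseStore" and "FlagDispatch" are hypothetical names, not classes from Mahout or Commons Math. The operation asks each operand whether it is sparse instead of testing its Java type, and picks the result storage accordingly.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface: sparseness is reported by the matrix, not implied by its type. */
interface FlaggedMatrix {
    int rows();
    int columns();
    double get(int row, int column);
    void set(int row, int column, double value);
    boolean isSparse();
}

/** Dense storage: every entry is held in a 2D array. */
class DenseStore implements FlaggedMatrix {
    private final int rows;
    private final int columns;
    private final double[][] data;

    DenseStore(int rows, int columns) {
        this.rows = rows;
        this.columns = columns;
        this.data = new double[rows][columns];
    }

    public int rows() { return rows; }
    public int columns() { return columns; }
    public double get(int row, int column) { return data[row][column]; }
    public void set(int row, int column, double value) { data[row][column] = value; }
    public boolean isSparse() { return false; }
}

/** Sparse storage: only non-zero entries are kept, keyed by a flattened index. */
class SparseStore implements FlaggedMatrix {
    private final int rows;
    private final int columns;
    private final Map<Long, Double> entries = new HashMap<>();

    SparseStore(int rows, int columns) {
        this.rows = rows;
        this.columns = columns;
    }

    private long key(int row, int column) { return (long) row * columns + column; }

    public int rows() { return rows; }
    public int columns() { return columns; }
    public double get(int row, int column) {
        return entries.getOrDefault(key(row, column), 0.0);
    }
    public void set(int row, int column, double value) {
        if (value == 0.0) {
            entries.remove(key(row, column));
        } else {
            entries.put(key(row, column), value);
        }
    }
    public boolean isSparse() { return true; }
}

final class FlagDispatch {
    private FlagDispatch() { }

    /** Adds two matrices; the result is sparse only if both operands report sparse. */
    static FlaggedMatrix add(FlaggedMatrix a, FlaggedMatrix b) {
        if (a.rows() != b.rows() || a.columns() != b.columns()) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        FlaggedMatrix result = (a.isSparse() && b.isSparse())
            ? new SparseStore(a.rows(), a.columns())
            : new DenseStore(a.rows(), a.columns());
        // For brevity this loops over all indices; a real sparse path would
        // iterate over stored entries only.
        for (int i = 0; i < a.rows(); i++) {
            for (int j = 0; j < a.columns(); j++) {
                double sum = a.get(i, j) + b.get(i, j);
                if (sum != 0.0) {
                    result.set(i, j, sum);
                }
            }
        }
        return result;
    }
}
```

With this design a generated matrix such as the trinary one can report itself as sparse or dense depending on "p" without belonging to either storage class, which a type test could not express.
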
> [...]
I'm wondering whether this leads again to the issue of CM being a framework
for everyone to build on, or if its data structures are mostly intended for
internal use, so that the algorithms it provides are most efficient.
Gilles

