commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sam Halliday <>
Subject Re: [math] Re: commons-math, matrix-toolkits-java and consolidation
Date Thu, 21 May 2009 10:13:51 GMT

Bill, I've had a look at some of the recent changes... some comments,
including some new thoughts:-

- changing the return type to be actual classes was only supposed to be for
copy() for 2.0. Doing this on multiply, add, etc is a big mistake... there
is no guarantee that is the best thing to do and it is likely this will
change in the future. This stands for all implementations. Great thought
needs to go into each method, not just by reading the current

- I *strongly* urge you to remove Serializable from everything! Please, we
did this in MTJ and it turned out to be a major pain. A more appropriate
approach is to define a class for reading/writing Matrix Market files. This
can be a new feature in 2.1. If you're going to leave it, at least document
that the Serializable form is not guaranteed to remain compatible across

- I discourage the use of the classes named *Impl. They will get very
confusing when other implementations are added later! Instead, I recommend
the names ArrayRealVector, Array2DRowRealMatrix (to indicate a 2D array
backed implementation using row ordering). This allows a column-based or 1D
implementation in the future without names getting very confusing. These
implementations are hidden from users who just use the MatrixUtils help

- -1 for inclusion of methods in the sparse interfaces and shape type enum.
This needs more thought and discussion before inclusion. I am happy enough
for marker interfaces to be included (as long as it is documented that these
interfaces are subject to update in the next release), but strongly against
including methods in them.

- could you please describe the hashing algorithm that OpenMap is using? I
have previously written a GNU Trove based sparse matrix implementation which
was a map from the primitive "long" to "double". It looks analagous. I broke
the row/column index into a long for the key using binary operations, but
the hashcode was tricky as it had to be an integer (not long). A naive
implementation can lead to collisions for typical or symmetric matrices. It
looks like the current implementation is using an int -> double backing map!
Wouldn't it make more sense to use a long?

The following code might be useful to you (don't forget the L marker and
casts or the compiler will silently cast to an int), there are two options
for a hashcode calculator... I've not stress tested the second one but it
might be more appropriate as it gives more weight to lower bits. Remember
that FF is 256, or a full byte... so a full 64 bit long is 8 bytes or 16 hex
characters in Java bitwise.

       int hash1(long val) {
               int first = Integer.reverse(key2row(val));
               int second = key2col(val);
               return first + second;

       int hash2(long val) {
               int first = key2row(val) & 0x0000FFFF;
               int second = key2col(val) << 16;
               return first + second;

       long key(int row, int column) {
               return (((long) row) << 32) + (long) column;

       int key2col(long key) {
               return (int) (key & 0x00000000FFFFFFFFL);

       int key2row(long key) {
               return (int) (key >>> 32);

Bill Barker wrote:
> I have a slight preference for leaving the markers empty until 3.0, but I 
> can remove them as well.  But I can wait to see what the community
> consensus 
> is before making changes.

View this message in context:
Sent from the Commons - Dev mailing list archive at

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message