commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ramesh Rajendran <rame...@s4carlisle.com>
Subject Re: [jira] Commented: (MATH-230) Implement Sparse Matrix Support
Date Fri, 05 Dec 2008 15:24:25 GMT
Ismael Juma (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/MATH-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653805#action_12653805
] 
>
> Ismael Juma commented on MATH-230:
> ----------------------------------
>
> Or we could just add the appropriate specialised Map to commons-math. Are there objections
to that? I could provide one, it's relatively easy to do. I think the best approach might
even be one not mentioned above. Use a Map.Entry object with two ints and a double. This has
a few advantages:
>
> (1) We use chaining instead of open addressing, so we can use a higher load factor with
good performance (e.g. 0.75 instead of 0.50). This should mitigate the additional memory used
by the Map.Entry object.
>
> (2) We use a single array of Map.Entry objects instead of two arrays (one for keys and
one for values). When testing the performance for Scala HashMap implementations (which converts
to bytecode very similar to the one generated by Java), it seems that using two arrays performs
quite a bit worse (although the test was using Strings instead of primitives for the value)[1].
>
> (3) Gets don't require any boxing if we provide a get(int, int) method (this is the same
as using open addressing with primitives only though).
>
> A rough sample of the implementation outline:
>
> class IntPairToDoubleHashMap {
>   static class Entry {
>     final int firstKey;
>     final int secondKey;
>     double value;
>   }
>   int size;
>   Entry[] entries;
>   
>   double get(int firstKey, int secondKey)
>   double put(int firstKey, int secondKey, double value)
> }
>
> This would be an internal map for usage by SparseRealMatrixImpl. I think HotSpot would
be able to inline calls to it, but we could also just have the relevant code in SparseMatrixImpl.
I think it's important to provide an implementation that has the right performance characteristics
by default.
>
> What do you think?
>
> [1] http://www.nabble.com/-scala--Open-versus-Chained-HashMap-td19254845.html
>
>   
>> Implement Sparse Matrix Support
>> -------------------------------
>>
>>                 Key: MATH-230
>>                 URL: https://issues.apache.org/jira/browse/MATH-230
>>             Project: Commons Math
>>          Issue Type: Improvement
>>    Affects Versions: 2.0
>>         Environment: N/A
>>            Reporter: Sujit Pal
>>            Assignee: Luc Maisonobe
>>            Priority: Minor
>>             Fix For: 2.0
>>
>>         Attachments: patch.txt, RealMatrixImplPerformanceTest.java, SparseRealMatrixImpl.java,
SparseRealMatrixImplTest.java
>>
>>
>> I needed a way to deal with large sparse matrices using commons-math RealMatrix,
so I implemented it. The SparseRealMatrixImpl is a subclass of RealMatrixImpl, and the backing
data structure is a Map<Point,Double>, where Point is a struct like inner-class which
exposes two int parameters row and column. I had to make some changes to the existing components
to keep the code for SparseRealMatrixImpl clean. Here are the details.
>> 1) RealMatrix.java:
>>    - added a new method setEntry(int, int, double) to set data into a matrix
>> 2) RealMatrixImpl.java:
>>    - changed all internal calls to data[i][j] to getEntry(i,j).
>>    - for some methods such as add(), subtract(), premultiply(), etc, there
>>      was code that checked for ClassCastException and had two versions,
>>      one for a generic RealMatrix and one for a RealMatrixImpl. This has
>>      been changed to have only one that operates on a RealMatrix. The 
>>      result is something like auto-type casting. So if:
>>        RealMatrixImpl.add(RealMatrix) returns a RealMatrixImpl
>>        SparseRealMatrixImpl.add(RealMatrix) returns a SparseRealMatrixImpl
>> 3) SparseRealMatrixImpl added as a subclass of RealMatrixImpl.
>> 4) LUDecompositionImpl changed to use a clone of the passed in RealMatrix
>>    instead of its data[][] block, and now it uses clone.getEntry(row,col)
>>    calls instead of data[row][col] calls.
>> 5) LUDecompositionImpl returned RealMatrixImpl for getL(), getU(), getP()
>>    and solve(). It now returns the same RealMatrix impl that is passed 
>>    in through its constructor for these methods.
>> 6) New test for SparseRealMatrixImpl, mimics the tests in RealMatrixImplTest,
>> 7) New static method to create SparseRealMatrixImpl out of a double[][] in
>>    MatrixUtils.createSparseRealMatrix().
>>    but using SparseRealMatrixImpl.
>> 8) Verified that all JUnit tests pass.
>>     
>
>   

*****************************************************************
This E-mail is confidential and intended only for the recipients to whom it is addressed.
If you were not an intended recipient, please notify the sender and delete all copies.
S4Carlisle does not accept liability for electronic file transfers.
*****************************************************************

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message