commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From luc.maison...@free.fr
Subject Re: [math] Performance optimization in the vector classes
Date Mon, 02 Nov 2009 15:41:13 GMT

----- "Jake Mannix" <jake.mannix@gmail.com> a écrit :

> Hey all,
> 
>   In digging through ArrayRealVector and OpenMapRealVector, while
> trying to
> draw up patches for MATH-312 and MATH-314, I found a number of
> performance (
> and other) issues:
> 
> 1) in add(double[]), subtract(double[]), and mapXXX methods, the idiom
> is:
> 
>   double[] v = new double[data.length];
>   for(int i=0; i<v.length; i++) {
>     v[i] = data[i] + otherVector[i]; // for example, for add()
>   }
>   return new ArrayRealVector(v);
> 
> if this were instead:
> 
>   double[] v = data.clone();
>   for(int i=0; i<v.length; i++) {
>     v[i] += otherVector[i];
>   }
>   return new ArrayRealVector(v, false);   // <- ack!  not passing
> false here
> forces a clone() call when you least need it!

This is the reason why the boolean is there.

> 
> The difference of doing only one array access inside the loop alone
> saved
> 30% on profiling tests on my laptop with 1M long double arrays.  I
> don't
> even want to think about the cost of making superflous array copies as
> well.

Fine, good catch. Let's implement it.

> 
> 2) In OpenMapRealVector, any case where two OpenMapRealVectors are
> combined
> together, iterating over the smaller should be preferred, but this is
> never
> checked for.

I think we discussed about similar things in the last few months, and simply forgot to do
it.

> 
> 3) what's with the idiom of:
> 
>   try { return methodName( (ArrayRealVector) v); } catch
> (ClassCastException
> cce) {
> 
> instead of the standard
> 
>   if(v instanceof ArrayRealVector) { return methodName(
> (ArrayRealVector)
> v); } else {
> 
> Is there something I'm missing on why the former shows up all over
> the
> place?  Maybe this isn't a performance issue, but I just noticed it.

In the cast, the check for the class is done internally, and can be optimised by the compiler,
particularly when this method is inlined. So this is an attempt to avoid one if and to let
the compiler directly call the specific methods for ArrayRealVector which is much faster than
the generic one. I understand it is a bit weird.

> 
> 4) ArrayRealVector#unitVector() calls norm() once, and then it calls
> it
> again.  That's clearly unnecessary.

You are right. the lat line should simply use the norm local variable which in fact was introduced
just for this reason.

> Also, why the special case for
> ArithmeticExcption here, for the zero norm case?  Why not just let
> java just
> try to divide, and if it divides by zero, well, it'll throw an
> ArithmeticException itself...  seems like the whole method should just
> be
> implemented as " return mapDivide(norm()); "

No. A division by zero does not lead to ArithmeticException when dealing with double variables,
it leads to a NaN being produced. ArithmeticException is triggered when an integer division
by zero occurs.

> 
> I've fixed many of these in my patch on MATH-312, but there's another
> version of that I need to post once I correctly handle the sparse
> nonzero
> defaultValue case.   I also wasn't sure if you guys usually liked to
> separate your fixes into a bunch of little ones or not.  It's just
> hard for
> me to keep a dozen different patches in sync my end without
> conflicting with
> myself now and again.

We really prefer separate patches as we commit them separately.
This allows having a clear history in the svn repository.

Luc

> 
>   -jake

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message