commons-user mailing list archives

From luc <...@spaceroots.org>
Subject Re: [math] noob; performance metrics?
Date Wed, 24 Jun 2015 10:18:35 GMT
On 2015-06-23 22:44, Luc Maisonobe wrote:
> Hi Andrew,
> 
> On 23/06/2015 19:08, Andrew E. Davidson wrote:
>> sorry if this has been asked many times before. (maybe this can be
>> added to the FAQ?)
>> 
>> has anyone done any benchmarking?
> 
> Yes.
> 
>> 
>> The idea of having a math package that is implemented in pure Java is
>> very attractive. My experience with machine learning is that Java is
>> very slow. To go fast you need to take advantage of assembler or
>> libraries written in Fortran or C. For example http://jblas.org/
> 
> It is not that simple, and in some cases, it can be slower ...
> 
> I did not find the benchmark I presented at several symposia in
> 2010, but here are some rough results.

I have found the plot from 2010 again and put it here:
   <https://people.apache.org/~luc/performances-QR.png>

Contrary to what I wrote below, the plot goes only to 1000x1000,
not 4000x4000, sorry for the confusion.

Luc

> 
> The tests were done on the QR decomposition plus solving of an
> A.X = B linear problem, with dense matrices. I did it for dimensions
> up to 4000x4000 if I remember correctly. The benchmark was made using
> the same underlying algorithm (but obviously different implementations).
> 
> The results were, in order of increasing performance:
> 
>   - Numerical Recipes in Fortran, non-optimized
>   - Numerical Recipes in Fortran, optimized
>   - LAPACK with ATLAS as a BLAS implementation
>     (almost no difference in non-optimized or optimized)
>   - Apache Commons Math !
> 
> Well, we were only about 2% faster than LAPACK, and it was on only
> one algorithm type, on my machine. I was happy and in fact surprised,
> I did not expect we could reach LAPACK performance. To get a more
> realistic picture, one should also look at other algorithms.
> 
> Answering your question for the general case is, however, more
> difficult, and here I don't have real benchmarks, only some general
> feelings. I would say that across different domains, the speed
> differences that can be observed are typically a factor of 1.5 or 2
> (Java being slower), which is a significant difference but clearly not
> as important as most people think. In fact, there are many factors
> other than language whose effect is also in this range of 1.5 or 2.
> 
> The lessons I learnt here are *not* that we are faster (for most
> operations, I am sure we are slower), but rather that language is
> only one factor for speed. Change the algorithm and you change the
> speed. Change the compiler and you change the speed. Change the
> optimizer and you change the speed. Change the human developer and
> you change the speed. Change your computer for one that is only a
> few months more recent and you change the speed ...
> 
> Attempting to use a Java-Fortran native interface to get speed is
> almost always a bad idea. The reason is that the layer between
> the languages is difficult to go through and really slow. You
> will spend much of the time in this layer rather than in real
> processing code. This is especially true for matrices because
> a double[][] is not packed as a contiguous run of double numbers
> in some specified order after an initial pointer, so you often
> have to copy between Java arrays (which are objects) and C or
> Fortran arrays, and you lose a lot of time doing copies.
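To make the copying cost above concrete, here is a minimal sketch of the packing step that typically has to happen before a double[][] can be handed to native code. The class RowMajorPacking and its methods are hypothetical helpers written for illustration, not part of Apache Commons Math or of any native binding. Row-major order is also an assumption: a Fortran library would normally expect column-major storage, which needs an element-by-element transpose instead of one arraycopy per row.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of any library. */
final class RowMajorPacking {

    /** Copies a rectangular double[][] into one contiguous row-major double[]. */
    static double[] pack(double[][] matrix) {
        int rows = matrix.length;
        int cols = rows == 0 ? 0 : matrix[0].length;
        double[] packed = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            // Each row is a separate Java object, so nothing guarantees
            // that all rows have the same length.
            if (matrix[i].length != cols) {
                throw new IllegalArgumentException(
                        "row " + i + " has length " + matrix[i].length
                        + ", expected " + cols);
            }
            System.arraycopy(matrix[i], 0, packed, i * cols, cols);
        }
        return packed;
    }

    /** Copies a row-major buffer back into a fresh double[rows][cols]. */
    static double[][] unpack(double[] packed, int rows, int cols) {
        if (packed.length != rows * cols) {
            throw new IllegalArgumentException(
                    "buffer length " + packed.length
                    + " does not match " + rows + "x" + cols);
        }
        double[][] matrix = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            System.arraycopy(packed, i * cols, matrix[i], 0, cols);
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};
        double[] packed = pack(a);
        System.out.println(Arrays.toString(packed));
        System.out.println(Arrays.deepToString(unpack(packed, 2, 3)));
    }
}
```

Both directions touch every element once, so a call that goes to native code and back pays for two full passes over the matrix (plus whatever the native layer itself copies) before any real processing is done.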
> 
> A Java-Fortran native interface is useful, but not for speed
> considerations. From my experience, it is more useful for
> interfacing with libraries that have only one implementation
> and that you cannot afford to port (because they are huge,
> because they are highly domain specific and nobody else uses
> them, because they have been validated and you cannot take
> the risk of introducing a bug by porting them, because you don't
> have the time, or because you don't have the money).
> 
> In my domain (space systems), we use Apache Commons Math and
> some upper level Java libraries extensively, and over the past
> few years we have replaced many older Fortran and C libraries.
> In all cases, we are either as fast or much faster. This is mainly
> because when developing these replacement libraries, we have chosen
> different architectures, used newer algorithms, and used different
> trade-offs between memory and processing than what was available
> to engineers 20 or 30 years ago. For sure, if they were to redevelop
> their libraries in Fortran today, they would also improve
> their results. So what is important is what you can achieve at
> the present time, using present algorithms and present languages.
> 
> If your work is really focused on linear algebra, there are
> other Java libraries that are faster than Apache Commons Math
> for this specific domain (some use native interface, some don't).
> Linear algebra is one of our weak points. Apache Commons Math
> is a library with a broad coverage, not a specialized one for
> linear algebra only.
> 
> So, in summary: yes, there have been some benchmarks. Yes,
> Java can be fast (and it can also be slow depending on how
> well the code is developed, just like in all other languages).
> 
> best regards,
> Luc
> 
>> 
>> 
>> Kind Regards
>> 
>> Andy
>> 
>> 
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org


