commons-dev mailing list archives

From Gary Gregory <garydgreg...@gmail.com>
Subject Fwd: svn commit: r1632210 - /commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java
Date Thu, 16 Oct 2014 05:01:45 GMT
Good to see you pop up!

What are your thoughts on getting to 1.0?

Gary

---------- Forwarded message ----------
From: <damjan@apache.org>
Date: Thu, Oct 16, 2014 at 12:49 AM
Subject: svn commit: r1632210 - /commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java
To: commits@commons.apache.org


Author: damjan
Date: Thu Oct 16 04:49:30 2014
New Revision: 1632210

URL: http://svn.apache.org/r1632210
Log:
Format some comments better.


Modified:

commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java

Modified: commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java
URL: http://svn.apache.org/viewvc/commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java?rev=1632210&r1=1632209&r2=1632210&view=diff
==============================================================================
--- commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java (original)
+++ commons/proper/imaging/trunk/src/main/java/org/apache/commons/imaging/formats/jpeg/decoder/Dct.java Thu Oct 16 04:49:30 2014
@@ -22,10 +22,13 @@ final class Dct {
      * Here's the cost, exluding modified (de)quantization, for transforming an
      * 8x8 block:
      *
-     * Algorithm Adds Multiplies RightShifts Total Naive 896 1024 0 1920
-     * "Symmetries" 448 224 0 672 Vetterli and 464 208 0 672 Ligtenberg Arai,
-     * Agui and 464 80 0 544 Nakajima (AA&N) Feig 8x8 462 54 6 522 Fused mul/add
-     * 416 (a pipe dream)
+     * Algorithm                     Adds Multiplies RightShifts Total
+     * Naive                          896       1024           0  1920
+     * "Symmetries"                   448        224           0   672
+     * Vetterli and Ligtenberg        464        208           0   672
+     * Arai, Agui and Nakajima (AA&N) 464         80           0   544
+     * Feig 8x8                       462         54           6   522
+     * Fused mul/add (a pipe dream)                                416
      *
      * IJG's libjpeg, FFmpeg, and a number of others use AA&N.
      *
@@ -33,21 +36,25 @@ final class Dct {
      * are reduced from 80 in AA&N to only 54. But in practice:
      *
      * Benchmarks, Intel Core i3 @ 2.93 GHz in long mode, 4 GB RAM Time taken to
-     * do 100 million IDCTs (less is better): Rene' Stöckel's Feig, int: 45.07
-     * seconds My Feig, floating point: 36.252 seconds AA&N, unrolled loops,
-     * double[][] -> double[][]: 25.167 seconds
+     * do 100 million IDCTs (less is better):
+     * Rene' Stöckel's Feig, int: 45.07 seconds
+     * My Feig, floating point: 36.252 seconds
+     * AA&N, unrolled loops, double[][] -> double[][]: 25.167 seconds
      *
      * Clearly Feig is hopeless. I suspect the performance killer is simply the
      * weight of the algorithm: massive number of local variables, large code
      * size, and lots of random array accesses.
      *
-     * Also, AA&N can be optimized a lot: AA&N, rolled loops, double[][] ->
-     * double[][]: 21.162 seconds AA&N, rolled loops, float[][] -> float[][]: no
-     * improvement, but at some stage Hotspot might start doing SIMD, so let's
+     * Also, AA&N can be optimized a lot:
+     * AA&N, rolled loops, double[][] -> double[][]: 21.162 seconds
+     * AA&N, rolled loops, float[][] -> float[][]: no improvement,
+     * but at some stage Hotspot might start doing SIMD, so let's
      * use float AA&N, rolled loops, float[] -> float[][]: 19.979 seconds
-     * apparently 2D arrays are slow! AA&N, rolled loops, inlined 1D AA&N
-     * transform, float[] transformed in-place: 18.5 seconds AA&N, previous
-     * version rewritten in C and compiled with "gcc -O3" takes: 8.5 seconds
+     * apparently 2D arrays are slow!
+     * AA&N, rolled loops, inlined 1D AA&N
+     * transform, float[] transformed in-place: 18.5 seconds
+     * AA&N, previous version rewritten in C and compiled with "gcc -O3"
+     * takes: 8.5 seconds
      * (probably due to heavy use of SIMD)
      *
      * Other brave attempts: AA&N, best float version converted to 16:16 fixed
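
For readers wondering where the "Naive" row of the table in the Dct.java comment could come from, here is a minimal sketch. It is not taken from Dct.java: the class NaiveDctCount and its counters are hypothetical, and it assumes "naive" means a row-column decomposition with a plain 8-point matrix product per row and column (the comment itself does not say how the figures were counted). Under that assumption each 1D transform costs 64 multiplies and 56 adds, and 16 of them give 1024 multiplies and 896 adds, which agrees with the table.

```java
// Hypothetical illustration only; not part of Commons Imaging.
final class NaiveDctCount {
    // Operation counters, incremented once per floating-point add/multiply.
    static long adds = 0;
    static long multiplies = 0;

    // Orthonormal 8-point DCT-II basis: C[u][x] = s(u) * cos((2x + 1) * u * pi / 16).
    static final double[][] C = new double[8][8];
    static {
        for (int u = 0; u < 8; u++) {
            double scale = (u == 0) ? Math.sqrt(1.0 / 8.0) : Math.sqrt(2.0 / 8.0);
            for (int x = 0; x < 8; x++) {
                C[u][x] = scale * Math.cos((2 * x + 1) * u * Math.PI / 16.0);
            }
        }
    }

    // Plain matrix-vector product: 8 outputs, each 8 multiplies and 7 adds.
    static void dct1d(double[] in, double[] out) {
        for (int u = 0; u < 8; u++) {
            double sum = C[u][0] * in[0];
            multiplies++;
            for (int x = 1; x < 8; x++) {
                sum += C[u][x] * in[x];
                multiplies++;
                adds++;
            }
            out[u] = sum;
        }
    }

    // Row-column 2D DCT: 8 row transforms followed by 8 column transforms.
    static double[][] dct2d(double[][] block) {
        double[][] tmp = new double[8][8];
        double[][] result = new double[8][8];
        double[] col = new double[8];
        double[] colOut = new double[8];
        for (int y = 0; y < 8; y++) {
            dct1d(block[y], tmp[y]);
        }
        for (int x = 0; x < 8; x++) {
            for (int y = 0; y < 8; y++) {
                col[y] = tmp[y][x];
            }
            dct1d(col, colOut);
            for (int v = 0; v < 8; v++) {
                result[v][x] = colOut[v];
            }
        }
        return result;
    }
}
```

The faster algorithms in the table (AA&N, Feig) reach their lower counts by exploiting symmetries of the cosine basis and by folding scale factors into (de)quantization; this sketch does neither.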





-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
