Mailing-List: contact mahout-commits-help@lucene.apache.org; run by ezmlm
Reply-To: mahout-dev@lucene.apache.org
Subject: svn commit: r883365 [33/47] - in /lucene/mahout/trunk: ./ examples/ matrix/ matrix/src/ matrix/src/main/ matrix/src/main/java/ matrix/src/main/java/org/ matrix/src/main/java/org/apache/ matrix/src/main/java/org/apache/mahout/ matrix/src/main/java/org/a...
Date: Mon, 23 Nov 2009 15:14:38 -0000
To: mahout-commits@lucene.apache.org
From: gsingers@apache.org
X-Mailer: svnmailer-1.0.8
Message-Id: <20091123151455.9C1892388A94@eris.apache.org>

Added: lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/DoubleMatrix2DComparator.java
URL: http://svn.apache.org/viewvc/lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/DoubleMatrix2DComparator.java?rev=883365&view=auto
==============================================================================
--- lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/DoubleMatrix2DComparator.java (added)
+++ lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/DoubleMatrix2DComparator.java Mon Nov 23 15:14:26 2009
@@ -0,0 +1,82 @@
+/*
+Copyright © 1999 CERN - European Organization for Nuclear Research.
+Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose
+is hereby granted without fee, provided that the above copyright notice appear in all copies and
+that both that copyright notice and this permission notice appear in supporting documentation.
+CERN makes no representations about the suitability of this software for any purpose.
+It is provided "as is" without expressed or implied warranty.
+*/
+package org.apache.mahout.colt.matrix.doublealgo;
+
+import org.apache.mahout.colt.matrix.DoubleMatrix2D;
+/**
+ * A comparison function which imposes a total ordering on some
+ * collection of elements. Comparators can be passed to a sort method (such as
+ * org.apache.mahout.colt.matrix.doublealgo.Sorting.quickSort) to allow precise control over the sort order.

+ *
+ * Note: It is generally a good idea for comparators to implement
+ * java.io.Serializable, as they may be used as ordering methods in
+ * serializable data structures. In
+ * order for the data structure to serialize successfully, the comparator (if
+ * provided) must implement Serializable.

+ *
+ * @author wolfgang.hoschek@cern.ch
+ * @version 1.0, 09/24/99
+ * @see java.util.Comparator
+ * @see org.apache.mahout.colt
+ * @see org.apache.mahout.colt.Sorting
+ */
+/**
+ * @deprecated until unit tests are in place. Until this time, this class/interface is unsupported.
+ */
+@Deprecated
+public interface DoubleMatrix2DComparator {
+/**
+ * Compares its two arguments for order. Returns a negative integer,
+ * zero, or a positive integer as the first argument is less than, equal
+ * to, or greater than the second.

+ *
+ * The implementor must ensure that sgn(compare(x, y)) ==
+ * -sgn(compare(y, x)) for all x and y. (This
+ * implies that compare(x, y) must throw an exception if and only
+ * if compare(y, x) throws an exception.)

+ *
+ * The implementor must also ensure that the relation is transitive:
+ * ((compare(x, y)>0) && (compare(y, z)>0)) implies
+ * compare(x, z)>0.

+ *
+ * Finally, the implementer must ensure that compare(x, y)==0
+ * implies that sgn(compare(x, z))==sgn(compare(y, z)) for all
+ * z.
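The three contract rules above can be checked on a small example. The following is a minimal sketch, not code from this commit: `MatrixComparator`, `ComparatorContractSketch` and `BY_SUM` are hypothetical names, plain `double[][]` arrays stand in for `DoubleMatrix2D`, and ordering by the sum of all cells is an arbitrary choice made only for illustration.

```java
import java.util.Arrays;

/** Hypothetical stand-in for DoubleMatrix2DComparator, working on plain double[][] arrays. */
interface MatrixComparator {
    int compare(double[][] a, double[][] b);
}

final class ComparatorContractSketch {
    /** Sum of all cells; the (assumed) sort key of this example. */
    static double sum(double[][] m) {
        double total = 0.0;
        for (double[] row : m) {
            for (double v : row) {
                total += v;
            }
        }
        return total;
    }

    /** Orders matrices by cell sum; Double.compare keeps the result sign-symmetric and transitive. */
    static final MatrixComparator BY_SUM = (a, b) -> Double.compare(sum(a), sum(b));

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};   // sum 10
        double[][] y = {{5, 5}, {5, 5}};   // sum 20
        double[][] z = {{9, 9}, {9, 9}};   // sum 36

        // Rule 1: sgn(compare(x, y)) == -sgn(compare(y, x))
        boolean antisymmetric =
            Integer.signum(BY_SUM.compare(x, y)) == -Integer.signum(BY_SUM.compare(y, x));
        // Rule 2: compare(z, y) > 0 and compare(y, x) > 0 imply compare(z, x) > 0
        boolean transitive =
            BY_SUM.compare(z, y) > 0 && BY_SUM.compare(y, x) > 0 && BY_SUM.compare(z, x) > 0;
        System.out.println(antisymmetric + " " + transitive);

        // The comparator can drive a sort; here java.util.Arrays.sort replaces Sorting.quickSort.
        double[][][] all = {z, x, y};
        Arrays.sort(all, BY_SUM::compare);
        System.out.println(sum(all[0]) + " " + sum(all[1]) + " " + sum(all[2]));
    }
}
```

Delegating to `Double.compare` rather than subtracting the two sums avoids overflow and NaN pitfalls that would otherwise break the sign rule.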

+ *
+ *
+ * @return a negative integer, zero, or a positive integer as the
+ * first argument is less than, equal to, or greater than the
+ * second.
+ */
+int compare(DoubleMatrix2D o1, DoubleMatrix2D o2);
+/**
+ *
+ * Indicates whether some other object is "equal to" this
+ * Comparator. This method must obey the general contract of
+ * Object.equals(Object). Additionally, this method can return
+ * true only if the specified Object is also a comparator
+ * and it imposes the same ordering as this comparator. Thus,
+ * comp1.equals(comp2) implies that sgn(comp1.compare(o1,
+ * o2))==sgn(comp2.compare(o1, o2)) for every element
+ * o1 and o2.

+ *
+ * Note that it is always safe not to override
+ * Object.equals(Object). However, overriding this method may,
+ * in some cases, improve performance by allowing programs to determine
+ * that two distinct Comparators impose the same order.
+ *
+ * @param obj the reference object with which to compare.
+ * @return true only if the specified object is also
+ * a comparator and it imposes the same ordering as this
+ * comparator.
+ * @see java.lang.Object#equals(java.lang.Object)
+ * @see java.lang.Object#hashCode()
+ */
+boolean equals(Object obj);
+}

Propchange: lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/DoubleMatrix2DComparator.java
------------------------------------------------------------------------------
    svn:eol-style = native

Added: lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/Formatter.java
URL: http://svn.apache.org/viewvc/lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/Formatter.java?rev=883365&view=auto
==============================================================================
--- lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/Formatter.java (added)
+++ lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/Formatter.java Mon Nov 23 15:14:26 2009
@@ -0,0 +1,812 @@
+/*
+Copyright © 1999 CERN - European Organization for Nuclear Research.
+Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose
+is hereby granted without fee, provided that the above copyright notice appear in all copies and
+that both that copyright notice and this permission notice appear in supporting documentation.
+CERN makes no representations about the suitability of this software for any purpose.
+It is provided "as is" without expressed or implied warranty.
+*/
+package org.apache.mahout.colt.matrix.doublealgo;
+
+import org.apache.mahout.colt.matrix.DoubleMatrix1D;
+import org.apache.mahout.colt.matrix.DoubleMatrix2D;
+import org.apache.mahout.colt.matrix.DoubleMatrix3D;
+import org.apache.mahout.colt.matrix.impl.AbstractFormatter;
+import org.apache.mahout.colt.matrix.impl.AbstractMatrix1D;
+import org.apache.mahout.colt.matrix.impl.AbstractMatrix2D;
+import org.apache.mahout.colt.matrix.impl.DenseDoubleMatrix1D;
+import org.apache.mahout.colt.matrix.impl.Former;
+/**
+Flexible, human-readable matrix print formatting; decimal point aligned by default. Built on top of the C-like sprintf functionality
+ provided by the Format class written by Cay Horstmann.
+ Currently works on 1-d, 2-d and 3-d matrices.
+ Note that in most cases you will not need to get familiar with this class; just call matrix.toString() and be happy with the default formatting.
+ This class is for advanced requirements.
+

Can't exactly remember the syntax of printf format strings? See Format
+ or Henrik
+ Nordberg's documentation, or Dinkumware's
+ C Library Reference.
+

Examples: +

+The examples demonstrate usage on 2-d matrices. Formatting of 1-d and 3-d matrices works very similarly.
+
Original matrix
+ +

double[][] values = {
+ {3, 0, -3.4, 0},
+ {5.1 ,0, +3.0123456789, 0},
+ {16.37, 0.0, 2.5, 0},
+ {-16.3, 0, -3.012345678E-4, -1},
+ {1236.3456789, 0, 7, -1.2}
+ };
+ matrix = new DenseDoubleMatrix2D(values);

+
+

 

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
format
Formatter.toString(matrix);
Formatter.toSourceCode(matrix);
%G
+ (default)
5 x 4 matrix
+    3        0 -3.4       0  
+    5.1      0  3.012346  0  
+   16.37     0  2.5       0  
+  -16.3      0 -0.000301 -1  
+ 1236.345679 0  7        -1.2 +
{
+    {   3       , 0, -3.4     ,  0  },
+    {   5.1     , 0,  3.012346,  0  },
+    {  16.37    , 0,  2.5     ,  0  },
+    { -16.3     , 0, -0.000301, -1  },
+    {1236.345679, 0,  7       , -1.2}
+ };
%1.10G
5 x 4 matrix
+    3         0 -3.4           0  
+    5.1       0  3.0123456789  0  
+   16.37      0  2.5           0  
+  -16.3       0 -0.0003012346 -1  
+ 1236.3456789 0  7            -1.2 +
{
+    {   3        , 0, -3.4         ,  0  },
+    {   5.1      , 0,  3.0123456789,  0  },
+    {  16.37     , 0,  2.5         ,  0  },
+    { -16.3      , 0, -0.0003012346, -1  },
+    {1236.3456789, 0,  7           , -1.2}
+ };
%f
5 x 4 matrix
+    3.000000 0.000000 -3.400000  0.000000
+    5.100000 0.000000  3.012346  0.000000
+   16.370000 0.000000  2.500000  0.000000
+  -16.300000 0.000000 -0.000301 -1.000000
+ 1236.345679 0.000000  7.000000 -1.200000
{
+    {   3.000000, 0.000000, -3.400000,  0.000000},
+    {   5.100000, 0.000000,  3.012346,  0.000000},
+    {  16.370000, 0.000000,  2.500000,  0.000000},
+    { -16.300000, 0.000000, -0.000301, -1.000000},
+    {1236.345679, 0.000000,  7.000000, -1.200000}
+ };
%1.2f
5 x 4 matrix
+    3.00 0.00 -3.40  0.00
+    5.10 0.00  3.01  0.00
+   16.37 0.00  2.50  0.00
+  -16.30 0.00 -0.00 -1.00
+ 1236.35 0.00  7.00 -1.20
{
+    {   3.00, 0.00, -3.40,  0.00},
+    {   5.10, 0.00,  3.01,  0.00},
+    {  16.37, 0.00,  2.50,  0.00},
+    { -16.30, 0.00, -0.00, -1.00},
+    {1236.35, 0.00,  7.00, -1.20}
+ };
%0.2e
5 x 4 matrix
+  3.00e+000 0.00e+000 -3.40e+000  0.00e+000
+  5.10e+000 0.00e+000  3.01e+000  0.00e+000
+  1.64e+001 0.00e+000  2.50e+000  0.00e+000
+ -1.63e+001 0.00e+000 -3.01e-004 -1.00e+000
+  1.24e+003 0.00e+000  7.00e+000 -1.20e+000
{
+    { 3.00e+000, 0.00e+000, -3.40e+000,  0.00e+000},
+    { 5.10e+000, 0.00e+000,  3.01e+000,  0.00e+000},
+    { 1.64e+001, 0.00e+000,  2.50e+000,  0.00e+000},
+    {-1.63e+001, 0.00e+000, -3.01e-004, -1.00e+000},
+    { 1.24e+003, 0.00e+000,  7.00e+000, -1.20e+000}
+ };
null
5 x 4 matrix
+    3.0       0.0 -3.4             0.0
+    5.1       0.0  3.0123456789    0.0
+   16.37      0.0  2.5             0.0
+  -16.3       0.0 -3.012345678E-4 -1.0
+ 1236.3456789 0.0  7.0            -1.2 +
{
+    {   3.0      , 0.0, -3.4           ,  0.0},
+    {   5.1      , 0.0,  3.0123456789  ,  0.0},
+    {  16.37     , 0.0,  2.5           ,  0.0},
+    { -16.3      , 0.0, -3.012345678E-4, -1.0},
+    {1236.3456789, 0.0,  7.0           , -1.2}
+ };
+ +

Here are some more elaborate examples, adding labels for axes, rows, columns, + title and some statistical aggregations.

+ + + + + + + + + + + + + + + + + + + +
+

double[][] values = {
+ {5 ,10, 20, 40 },
+ { 7, 8 , 6 , 7 },
+ {12 ,10, 20, 19 },
+ { 3, 1 , 5 , 6 }
+ };
+
String title = "CPU performance over time [nops/sec]";
+ String columnAxisName = "Year";
+ String rowAxisName = "CPU";
+ String[] columnNames = {"1996", "1997", "1998", "1999"};
+ String[] rowNames = { "PowerBar", "Benzol", "Mercedes", "Sparcling"};
+ hep.aida.bin.BinFunctions1D F = hep.aida.bin.BinFunctions1D.functions; // alias
+ hep.aida.bin.BinFunction1D[] aggr = {F.mean, F.rms, F.quantile(0.25), F.median, F.quantile(0.75), F.stdDev, F.min, F.max};
+ String format = "%1.2G";
+ DoubleMatrix2D matrix = new DenseDoubleMatrix2D(values);
+ new Formatter(format).toTitleString(
+    matrix,rowNames,columnNames,rowAxisName,columnAxisName,title,aggr);
+

+
+CPU performance over time [nops/sec]
+            | Year
+            | 1996  1997  1998  1999  | Mean  RMS   25% Q. Median 75% Q. StdDev Min Max
+---------------------------------------------------------------------------------------
+C PowerBar  |  5    10    20    40    | 18.75 23.05  8.75  15     25     15.48   5  40 
+P Benzol    |  7     8     6     7    |  7     7.04  6.75   7      7.25   0.82   6   8 
+U Mercedes  | 12    10    20    19    | 15.25 15.85 11.5   15.5   19.25   4.99  10  20 
+  Sparcling |  3     1     5     6    |  3.75  4.21  2.5    4      5.25   2.22   1   6 
+---------------------------------------------------------------------------------------
+  Mean      |  6.75  7.25 12.75 18    |                                                
+  RMS       |  7.53  8.14 14.67 22.62 |                                                
+  25% Q.    |  4.5   6.25  5.75  6.75 |                                                
+  Median    |  6     9    13    13    |                                                
+  75% Q.    |  8.25 10    20    24.25 |                                                
+  StdDev    |  3.86  4.27  8.38 15.81 |                                                
+  Min       |  3     1     5     6    |                                                
+  Max       | 12    10    20    19    |                                                 +
+
same as above, but now without aggregations
+ aggr=null;
CPU performance over time [nops/sec]
+             | Year
+             | 1996 1997 1998 1999
+ ---------------------------------
+ C PowerBar  |  5   10   20   40  
+ P Benzol    |  7    8    6    7  
+ U Mercedes  | 12   10   20   19  
+   Sparcling |  3    1    5    6   +
+

same as above, but now without rows labeled
+ aggr=null;
+ rowNames=null;
+ rowAxisName=null;

+
+CPU performance over time [nops/sec]
+Year
+1996 1997 1998 1999
+-------------------
+ 5   10   20   40  
+ 7    8    6    7  
+12   10   20   19  
+ 3    1    5    6   +
+
+ +

A column can be wider than the parameter minColumnWidth specifies
+ (because a cell may not fit into that width), but it is never narrower than
+ minColumnWidth. Normally there is no need to specify minColumnWidth
+ (the default is 1). The parameter is only of interest when printing
+ two distinct matrices with the same column width,
+ for example to make it easier to see which column of matrix A corresponds to
+ which column of matrix B.
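The width rule for minColumnWidth can be sketched in a few lines. This is an illustration under stated assumptions, not the class's actual code: `ColumnWidthSketch`, `columnWidth` and `pad` are hypothetical helpers, and cells are taken to be already converted to strings.

```java
final class ColumnWidthSketch {
    /** Effective width: the widest cell, but never less than minColumnWidth. */
    static int columnWidth(String[] cells, int minColumnWidth) {
        int width = minColumnWidth;
        for (String cell : cells) {
            width = Math.max(width, cell.length());
        }
        return width;
    }

    /** Right-pads a cell with blanks up to the given width. */
    static String pad(String cell, int width) {
        StringBuilder sb = new StringBuilder(cell);
        while (sb.length() < width) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] columnOfA = {"5", "10", "20"};
        String[] columnOfB = {"1236.35", "3.00"};
        // With the default of 1 the two columns end up with different widths.
        System.out.println(columnWidth(columnOfA, 1) + " vs " + columnWidth(columnOfB, 1));
        // A shared minimum makes both columns equally wide, so they line up when printed.
        int shared = 7;
        System.out.println("[" + pad(columnOfA[0], columnWidth(columnOfA, shared)) + "]");
        System.out.println("[" + pad(columnOfB[0], columnWidth(columnOfB, shared)) + "]");
    }
}
```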

+ +

Implementation:

+ +

Note that this class is by no means meant to be used for high-performance I/O (serialization is much quicker).
+ It is meant to produce human-readable output.

+

Analyzes the entire matrix before producing output. Each cell is converted + to a String as indicated by the given C-like format string. If null + is passed as format string, {@link java.lang.Double#toString(double)} is used + instead, yielding full precision.
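The per-cell conversion step, including the null fallback, might look as follows. This is a sketch only: it swaps in `java.lang.String.format` for the Format/Former classes the source relies on, and `CellFormatSketch.form` is a hypothetical helper. Java's format syntax is not identical to the C-like one used here (for example, Java rejects "%0.2e" because the 0 flag needs a width), so only a simple conversion such as "%1.2f" is shown.

```java
import java.util.Locale;

final class CellFormatSketch {
    /**
     * Converts one cell to a string. A null format falls back to
     * Double.toString(double), which preserves full precision.
     */
    static String form(String format, double value) {
        if (format == null) {
            return Double.toString(value);
        }
        // Locale.ROOT keeps '.' as the decimal separator regardless of the default locale.
        return String.format(Locale.ROOT, format, value);
    }

    public static void main(String[] args) {
        System.out.println(form("%1.2f", 1236.3456789));   // 1236.35
        System.out.println(form(null, -3.012345678E-4));   // -3.012345678E-4
    }
}
```

Both printed values agree with the corresponding cells of the "%1.2f" and null rows in the example table above.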

+

Next, leading and trailing whitespace is removed. For each column the maximum number of characters before
+ and after the decimal point is determined. (Missing decimal points are
+ not a problem.) Each cell is then padded with leading and trailing blanks, as necessary
+ to achieve decimal point aligned, left justified formatting.
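The alignment step for a single column can be sketched as below. `indexOfDecimalPoint` mirrors the method of the same name further down in this diff; `DecimalAlignSketch`, `alignColumn` and `blanks` are hypothetical names for this illustration, which assumes the cells have already been converted to strings.

```java
final class DecimalAlignSketch {
    /** Index of the decimal point; falls back to an exponent marker, then to the string length. */
    static int indexOfDecimalPoint(String s) {
        int i = s.lastIndexOf('.');
        if (i < 0) i = s.lastIndexOf('e');
        if (i < 0) i = s.lastIndexOf('E');
        if (i < 0) i = s.length();
        return i;
    }

    /** Returns a string of n blanks. */
    static String blanks(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(' ');
        return sb.toString();
    }

    /** Pads every cell of one column so that all decimal points share the same position. */
    static String[] alignColumn(String[] cells) {
        String[] trimmed = new String[cells.length];
        int maxLead = 0;   // most characters before the decimal point
        int maxTrail = 0;  // most characters from the decimal point onwards
        for (int r = 0; r < cells.length; r++) {
            trimmed[r] = cells[r].trim();
            int lead = indexOfDecimalPoint(trimmed[r]);
            maxLead = Math.max(maxLead, lead);
            maxTrail = Math.max(maxTrail, trimmed[r].length() - lead);
        }
        String[] aligned = new String[cells.length];
        for (int r = 0; r < cells.length; r++) {
            int lead = indexOfDecimalPoint(trimmed[r]);
            int trail = trimmed[r].length() - lead;
            aligned[r] = blanks(maxLead - lead) + trimmed[r] + blanks(maxTrail - trail);
        }
        return aligned;
    }

    public static void main(String[] args) {
        // First column of the "%G" example above.
        String[] column = {"3", "5.1", "16.37", "-16.3", "1236.345679"};
        for (String cell : alignColumn(column)) {
            System.out.println("[" + cell + "]");
        }
    }
}
```

A cell without a decimal point, such as "3", is treated as if the point followed its last character, which is why it lines up with the integer parts of the other cells.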

+ +@author wolfgang.hoschek@cern.ch +@version 1.2, 11/30/99 +*/ +/** + * @deprecated until unit tests are in place. Until this time, this class/interface is unsupported. + */ +@Deprecated +public class Formatter extends AbstractFormatter { +/** + * Constructs and returns a matrix formatter with format "%G". + */ +public Formatter() { + this("%G"); +} +/** + * Constructs and returns a matrix formatter. + * @param format the given format used to convert a single cell value. + */ +public Formatter(String format) { + setFormat(format); + setAlignment(DECIMAL); +} +/** + * Demonstrates how to use this class. + */ +public static void demo1() { +// parameters +double[][] values = { + {3, 0, -3.4, 0}, + {5.1 ,0, +3.0123456789, 0}, + {16.37, 0.0, 2.5, 0}, + {-16.3, 0, -3.012345678E-4, -1}, + {1236.3456789, 0, 7, -1.2} +}; +String[] formats = {"%G", "%1.10G", "%f", "%1.2f", "%0.2e", null}; + + +// now the processing +int size = formats.length; +DoubleMatrix2D matrix = org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values); +String[] strings = new String[size]; +String[] sourceCodes = new String[size]; +String[] htmlStrings = new String[size]; +String[] htmlSourceCodes = new String[size]; + +for (int i=0; i= 0; ) { + for (int j=size; --j >= 0; ) { + buf.append(matrix.getQuick(i,j)); + } + } + buf = null; + timer.stop().display(); + + timer.reset().start(); + org.apache.mahout.colt.matrix.impl.Former format = new org.apache.mahout.colt.matrix.impl.FormerFactory().create("%G"); + buf = new StringBuffer(); + for (int i=size; --i >= 0; ) { + for (int j=size; --j >= 0; ) { + buf.append(format.form(matrix.getQuick(i,j))); + } + } + buf = null; + timer.stop().display(); + + timer.reset().start(); + s = new Formatter(null).toString(matrix); + //System.out.println(s); + s = null; + timer.stop().display(); + + timer.reset().start(); + s = new Formatter("%G").toString(matrix); + //System.out.println(s); + s = null; + timer.stop().display(); +} +/** + * Demonstrates how to use 
this class. + */ +public static void demo4() { +// parameters +double[][] values = { + {3, 0, -3.4, 0}, + {5.1 ,0, +3.0123456789, 0}, + {16.37, 0.0, 2.5, 0}, + {-16.3, 0, -3.012345678E-4, -1}, + {1236.3456789, 0, 7, -1.2} +}; +/* +double[][] values = { + {3, 1, }, + {5.1 ,16.37, } +}; +*/ +//String[] columnNames = { "he", "", "he", "four" }; +//String[] rowNames = { "hello", "du", null, "abcdef", "five" }; +String[] columnNames = { "0.1", "0.3", "0.5", "0.7" }; +String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8", "SunJDK1.3 Hotspot", "other1", "other2" }; +//String[] columnNames = { "0.1", "0.3" }; +//String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8"}; + +DoubleMatrix2D matrix = org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values); +System.out.println("\n\n"+new Formatter("%G").toTitleString(matrix,rowNames,columnNames,"rowAxis","colAxis","VM Performance: Provider vs. matrix density")); +} +/** + * Demonstrates how to use this class. + */ +public static void demo5() { +// parameters +double[][] values = { + {3, 0, -3.4, 0}, + {5.1 ,0, +3.0123456789, 0}, + {16.37, 0.0, 2.5, 0}, + {-16.3, 0, -3.012345678E-4, -1}, + {1236.3456789, 0, 7, -1.2} +}; +/* +double[][] values = { + {3, 1, }, + {5.1 ,16.37, } +}; +*/ +//String[] columnNames = { "he", "", "he", "four" }; +//String[] rowNames = { "hello", "du", null, "abcdef", "five" }; +String[] columnNames = { "0.1", "0.3", "0.5", "0.7" }; +String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8", "SunJDK1.3 Hotspot", "other1", "other2" }; +//String[] columnNames = { "0.1", "0.3" }; +//String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8"}; + +System.out.println(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values)); +System.out.println(new Formatter("%G").toTitleString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values),rowNames,columnNames,"vendor","density","title")); +} +/** + * Demonstrates how to use this class. 
+ */ +public static void demo6() { +// parameters +double[][] values = { + {3, 0, -3.4, 0}, + {5.1 ,0, +3.0123456789, 0}, + {16.37, 0.0, 2.5, 0}, + {-16.3, 0, -3.012345678E-4, -1}, + {1236.3456789, 0, 7, -1.2} +}; +/* +double[][] values = { + {3, 1, }, + {5.1 ,16.37, } +}; +*/ +//String[] columnNames = { "he", "", "he", "four" }; +//String[] rowNames = { "hello", "du", null, "abcdef", "five" }; +//String[] columnNames = { "0.1", "0.3", "0.5", "0.7" }; +String[] columnNames = { "W", "X", "Y", "Z"}; +String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8", "SunJDK1.3 Hotspot", "other1", "other2" }; +//String[] columnNames = { "0.1", "0.3" }; +//String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8"}; + +//System.out.println(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values)); +//System.out.println(new Formatter().toSourceCode(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values))); +System.out.println(new Formatter().toString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values))); +System.out.println(new Formatter().toTitleString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values),rowNames,columnNames,"vendor","density","title")); +} +/** + * Demonstrates how to use this class. 
+ */ +public static void demo7() { +// parameters +/* +double[][] values = { + {3, 0, -3.4, 0}, + {5.1 ,0, +3.0123456789, 0}, + {16.37, 0.0, 2.5, 0}, + {-16.3, 0, -3.012345678E-4, -1}, + {1236.3456789, 0, 7, -1.2} +}; +*/ +double[][] values = { + {5 ,10, 20, 40 }, + { 7, 8 , 6 , 7 }, + {12 ,10, 20, 19 }, + { 3, 1 , 5 , 6 } +}; +String[] columnNames = {"1996", "1997", "1998", "1999"}; +String[] rowNames = { "PowerBar", "Benzol", "Mercedes", "Sparcling"}; +String rowAxisName = "CPU"; +String columnAxisName = "Year"; +String title = "CPU performance over time [nops/sec]"; +//hep.aida.bin.BinFunctions1D F = hep.aida.bin.BinFunctions1D.functions; +//hep.aida.bin.BinFunction1D[] aggr = {F.mean, F.rms, F.quantile(0.25), F.median,F.quantile(0.75), F.stdDev, F.min, F.max}; +String format = "%1.2G"; + +//String[] columnNames = { "W", "X", "Y", "Z", "mean", "median", "sum"}; +//String[] rowNames = { "SunJDK1.2.2 classic", "IBMJDK1.1.8", "SunJDK1.3 Hotspot", "other1", "other2", "mean", "median", "sum" }; +//hep.aida.bin.BinFunction1D[] aggr = {F.mean, F.median, F.sum}; + +//System.out.println(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values)); +//System.out.println(new Formatter().toSourceCode(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values))); +//System.out.println(new Formatter().toString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values))); +//System.out.println(new Formatter().toTitleString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values),rowNames,columnNames,rowAxisName,columnAxisName,title)); +//System.out.println(new Formatter(format).toTitleString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values),rowNames,columnNames,rowAxisName,columnAxisName,title, aggr)); +//System.out.println(org.apache.mahout.colt.matrixpattern.Converting.toHTML(new Formatter(format).toTitleString(org.apache.mahout.colt.matrix.DoubleFactory2D.dense.make(values),rowNames,columnNames,rowAxisName,columnAxisName,title, 
aggr))); +} +/** + * Converts a given cell to a String; no alignment considered. + */ +protected String form(DoubleMatrix1D matrix, int index, Former formatter) { + return formatter.form(matrix.get(index)); +} +/** + * Converts a given cell to a String; no alignment considered. + */ +protected String form(AbstractMatrix1D matrix, int index, Former formatter) { + return this.form((DoubleMatrix1D) matrix, index, formatter); +} +/** + * Returns a string representations of all cells; no alignment considered. + */ +public String[][] format(DoubleMatrix2D matrix) { + String[][] strings = new String[matrix.rows()][matrix.columns()]; + for (int row=matrix.rows(); --row >= 0; ) strings[row] = formatRow(matrix.viewRow(row)); + return strings; +} +/** + * Returns a string representations of all cells; no alignment considered. + */ +protected String[][] format(AbstractMatrix2D matrix) { + return this.format((DoubleMatrix2D) matrix); +} +/** + * Returns the index of the decimal point. + */ +protected int indexOfDecimalPoint(String s) { + int i = s.lastIndexOf('.'); + if (i<0) i = s.lastIndexOf('e'); + if (i<0) i = s.lastIndexOf('E'); + if (i<0) i = s.length(); + return i; +} +/** + * Returns the number of characters before the decimal point. + */ +protected int lead(String s) { + if (alignment.equals(DECIMAL)) return indexOfDecimalPoint(s); + return super.lead(s); +} +/** + * Returns a string s such that Object[] m = s is a legal Java statement. + * @param matrix the matrix to format. + */ +public String toSourceCode(DoubleMatrix1D matrix) { + Formatter copy = (Formatter) this.clone(); + copy.setPrintShape(false); + copy.setColumnSeparator(", "); + String lead = "{"; + String trail = "};"; + return lead + copy.toString(matrix) + trail; +} +/** + * Returns a string s such that Object[] m = s is a legal Java statement. + * @param matrix the matrix to format. 
+ */ +public String toSourceCode(DoubleMatrix2D matrix) { + Formatter copy = (Formatter) this.clone(); + String b3 = blanks(3); + copy.setPrintShape(false); + copy.setColumnSeparator(", "); + copy.setRowSeparator("},\n"+b3+"{"); + String lead = "{\n"+b3+"{"; + String trail = "}\n};"; + return lead + copy.toString(matrix) + trail; +} +/** + * Returns a string s such that Object[] m = s is a legal Java statement. + * @param matrix the matrix to format. + */ +public String toSourceCode(DoubleMatrix3D matrix) { + Formatter copy = (Formatter) this.clone(); + String b3 = blanks(3); + String b6 = blanks(6); + copy.setPrintShape(false); + copy.setColumnSeparator(", "); + copy.setRowSeparator("},\n"+b6+"{"); + copy.setSliceSeparator("}\n"+b3+"},\n"+b3+"{\n"+b6+"{"); + String lead = "{\n"+b3+"{\n"+b6+"{"; + String trail = "}\n"+b3+"}\n}"; + return lead + copy.toString(matrix) + trail; +} +/** + * Returns a string representation of the given matrix. + * @param matrix the matrix to convert. + */ +public String toString(DoubleMatrix1D matrix) { + DoubleMatrix2D easy = matrix.like2D(1,matrix.size()); + easy.viewRow(0).assign(matrix); + return toString(easy); +} +/** + * Returns a string representation of the given matrix. + * @param matrix the matrix to convert. + */ +public String toString(DoubleMatrix2D matrix) { + return super.toString(matrix); +} +/** + * Returns a string representation of the given matrix. + * @param matrix the matrix to convert. + */ +public String toString(DoubleMatrix3D matrix) { + StringBuffer buf = new StringBuffer(); + boolean oldPrintShape = this.printShape; + this.printShape = false; + for (int slice=0; slice < matrix.slices(); slice++) { + if (slice!=0) buf.append(sliceSeparator); + buf.append(toString(matrix.viewSlice(slice))); + } + this.printShape = oldPrintShape; + if (printShape) buf.insert(0,shape(matrix) + "\n"); + return buf.toString(); +} +/** + * Returns a string representation of the given matrix. + * @param matrix the matrix to convert. 
+ */ +protected String toString(AbstractMatrix2D matrix) { + return this.toString((DoubleMatrix2D) matrix); +} +/** +Returns a string representation of the given matrix with axis as well as rows and columns labeled. +Pass null to one or more parameters to indicate that the corresponding decoration element shall not appear in the string converted matrix. + +@param matrix The matrix to format. +@param rowNames The headers of all rows (to be put to the left of the matrix). +@param columnNames The headers of all columns (to be put to above the matrix). +@param rowAxisName The label of the y-axis. +@param columnAxisName The label of the x-axis. +@param title The overall title of the matrix to be formatted. +@return the matrix converted to a string. +*/ +protected String toTitleString(DoubleMatrix2D matrix, String[] rowNames, String[] columnNames, String rowAxisName, String columnAxisName, String title) { + if (matrix.size()==0) return "Empty matrix"; + String[][] s = format(matrix); + //String oldAlignment = this.alignment; + //this.alignment = DECIMAL; + align(s); + //this.alignment = oldAlignment; + return new org.apache.mahout.colt.matrix.objectalgo.Formatter().toTitleString(org.apache.mahout.colt.matrix.ObjectFactory2D.dense.make(s), rowNames,columnNames,rowAxisName,columnAxisName,title); +} +/** +Same as toTitleString except that additionally statistical aggregates (mean, median, sum, etc.) of rows and columns are printed. +Pass null to one or more parameters to indicate that the corresponding decoration element shall not appear in the string converted matrix. + +@param matrix The matrix to format. +@param rowNames The headers of all rows (to be put to the left of the matrix). +@param columnNames The headers of all columns (to be put to above the matrix). +@param rowAxisName The label of the y-axis. +@param columnAxisName The label of the x-axis. +@param title The overall title of the matrix to be formatted. 
+@param aggr the aggregation functions to be applied to columns and rows. +@return the matrix converted to a string. +@see hep.aida.bin.BinFunction1D +@see hep.aida.bin.BinFunctions1D + +public String toTitleString(DoubleMatrix2D matrix, String[] rowNames, String[] columnNames, String rowAxisName, String columnAxisName, String title, hep.aida.bin.BinFunction1D[] aggr) { + if (matrix.size()==0) return "Empty matrix"; + if (aggr==null || aggr.length==0) return toTitleString(matrix,rowNames,columnNames,rowAxisName,columnAxisName,title); + + DoubleMatrix2D rowStats = matrix.like(matrix.rows(), aggr.length); // hold row aggregations + DoubleMatrix2D colStats = matrix.like(aggr.length, matrix.columns()); // hold column aggregations + + org.apache.mahout.colt.matrix.doublealgo.Statistic.aggregate(matrix, aggr, colStats); // aggregate an entire column at a time + org.apache.mahout.colt.matrix.doublealgo.Statistic.aggregate(matrix.viewDice(), aggr, rowStats.viewDice()); // aggregate an entire row at a time + + // turn into strings + // tmp holds "matrix" plus "colStats" below (needed so that numbers in a columns can be decimal point aligned) + DoubleMatrix2D tmp = matrix.like(matrix.rows()+aggr.length, matrix.columns()); + tmp.viewPart(0,0,matrix.rows(),matrix.columns()).assign(matrix); + tmp.viewPart(matrix.rows(),0,aggr.length,matrix.columns()).assign(colStats); + colStats = null; + + String[][] s1 = format(tmp); align(s1); tmp = null; + String[][] s2 = format(rowStats); align(s2); rowStats = null; + + // copy strings into a large matrix holding the source matrix and all aggregations + org.apache.mahout.colt.matrix.ObjectMatrix2D allStats = org.apache.mahout.colt.matrix.ObjectFactory2D.dense.make(matrix.rows()+aggr.length, matrix.columns()+aggr.length+1); + allStats.viewPart(0,0,matrix.rows()+aggr.length,matrix.columns()).assign(s1); + allStats.viewColumn(matrix.columns()).assign("|"); + allStats.viewPart(0,matrix.columns()+1,matrix.rows(),aggr.length).assign(s2); + s1 = 
null; s2 = null; + + // append a vertical "|" separator plus names of aggregation functions to line holding columnNames + if (columnNames!=null) { + org.apache.mahout.colt.list.ObjectArrayList list = new org.apache.mahout.colt.list.ObjectArrayList(columnNames); + list.add("|"); + for (int i=0; inull to one or more parameters to indicate that the corresponding decoration element shall not appear in the string converted matrix. + +@param matrix The matrix to format. +@param sliceNames The headers of all slices (to be put above each slice). +@param rowNames The headers of all rows (to be put to the left of the matrix). +@param columnNames The headers of all columns (to be put to above the matrix). +@param sliceAxisName The label of the z-axis (to be put above each slice). +@param rowAxisName The label of the y-axis. +@param columnAxisName The label of the x-axis. +@param title The overall title of the matrix to be formatted. +@param aggr the aggregation functions to be applied to columns, rows. +@return the matrix converted to a string. +@see hep.aida.bin.BinFunction1D +@see hep.aida.bin.BinFunctions1D + +public String toTitleString(DoubleMatrix3D matrix, String[] sliceNames, String[] rowNames, String[] columnNames, String sliceAxisName, String rowAxisName, String columnAxisName, String title, hep.aida.bin.BinFunction1D[] aggr) { + if (matrix.size()==0) return "Empty matrix"; + StringBuffer buf = new StringBuffer(); + for (int i=0; i + * Performance + *

+ * Partitioning into two intervals is O( N ). + * Partitioning into k intervals is O( N * log(k)). + * Constant factors are minimized. + * + * @see org.apache.mahout.colt.Partitioning "Partitioning arrays (provides more documentation)" + * + * @author wolfgang.hoschek@cern.ch + * @version 1.0, 09/24/99 + */ +/** + * @deprecated until unit tests are in place. Until this time, this class/interface is unsupported. + */ +@Deprecated +public class Partitioning extends Object { +/** + * Makes this class non-instantiable, but still lets others inherit from it. + */ +protected Partitioning() {} +/** +Same as {@link org.apache.mahout.colt.Partitioning#partition(int[],int,int,int[],int,int,int[])} +except that it synchronously partitions the rows of the given matrix by the values of the given matrix column; +This is essentially the same as partitioning a list of composite objects by some instance variable; +In other words, two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Let's say, a "row" is an "object" (tuple, d-dimensional point). +A "column" is the list of "object" values of a given variable (field, dimension). +A "matrix" is a list of "objects" (tuples, points). +

+Now, rows (objects, tuples) are partially sorted according to their values in one given variable (dimension). +Two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Note that arguments are not checked for validity. +

+Example: + + + + + + +
8 x 3 matrix:
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+ 11, 10, 9
+ 8, 7, 6
+ 5, 4, 3
+ 2, 1, 0
+

column = 0;
+ rowIndexes = {0,1,2,..,matrix.rows()-1}; + rowFrom = 0;
+ rowTo = matrix.rows()-1;
+ splitters = {5,10,12}
+ c = 0;
+ d = splitters.length-1;
+ partition(matrix,rowIndexes,rowFrom,rowTo,column,splitters,c,d,splitIndexes);
+ ==>
+ splitIndexes == {0, 2, 3}
+ rowIndexes == {7, 6, 5, 4, 0, 1, 2, 3}

+
+ The matrix IS NOT REORDERED.
+ Here is how it would look
+ if it were reordered
+ according to rowIndexes.
+ 8 x 3 matrix:
+ 2, 1, 0
+ 5, 4, 3
+ 8, 7, 6
+ 11, 10, 9
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+@param matrix the matrix to be partitioned. +@param rowIndexes the index of the i-th row; is modified by this method to reflect partitioned indexes. +@param rowFrom the index of the first row (inclusive). +@param rowTo the index of the last row (inclusive). +@param column the index of the column to partition on. +@param splitters the values at which the rows shall be split into intervals. + Must be sorted ascending and must not contain multiple identical values. + These preconditions are not checked; be sure that they are met. + +@param splitFrom the index of the first splitter element to be considered. +@param splitTo the index of the last splitter element to be considered. + The method considers the splitter elements splitters[splitFrom] .. splitters[splitTo]. + +@param splitIndexes a list into which this method fills the indexes of rows delimiting intervals. +Upon return splitIndexes[splitFrom..splitTo] will be set accordingly. +Therefore, must satisfy splitIndexes.length >= splitters.length. 
+*/ +public static void partition(DoubleMatrix2D matrix, int[] rowIndexes, int rowFrom, int rowTo, int column, final double[] splitters, int splitFrom, int splitTo, int[] splitIndexes) { + if (rowFrom < 0 || rowTo >= matrix.rows() || rowTo >= rowIndexes.length) throw new IllegalArgumentException(); + if (column < 0 || column >= matrix.columns()) throw new IllegalArgumentException(); + if (splitFrom < 0 || splitTo >= splitters.length) throw new IllegalArgumentException(); + if (splitIndexes.length < splitters.length) throw new IllegalArgumentException(); + + // this one knows how to swap two row indexes (a,b) + final int[] g = rowIndexes; + Swapper swapper = new Swapper() { + public void swap(int b, int c) { + int tmp = g[b]; g[b] = g[c]; g[c] = tmp; + } + }; + + // compare splitter[a] with columnView[rowIndexes[b]] + final DoubleMatrix1D columnView = matrix.viewColumn(column); + IntComparator comp = new IntComparator() { + public int compare(int a, int b) { + double av = splitters[a]; + double bv = columnView.getQuick(g[b]); + return avsynchronously partitions the rows of the given matrix by the values of the given matrix column; +This is essentially the same as partitioning a list of composite objects by some instance variable; +In other words, two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Let's say, a "row" is an "object" (tuple, d-dimensional point). +A "column" is the list of "object" values of a given variable (field, dimension). +A "matrix" is a list of "objects" (tuples, points). +

+Now, rows (objects, tuples) are partially sorted according to their values in one given variable (dimension). +Two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Note that arguments are not checked for validity. +

+Example: + + + + + + +
8 x 3 matrix:
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+ 11, 10, 9
+ 8, 7, 6
+ 5, 4, 3
+ 2, 1, 0
+ column = 0;
+ splitters = {5,10,12}
+ partition(matrix,column,splitters,splitIndexes);
+ ==>
+ splitIndexes == {0, 2, 3}

+
+ The matrix IS NOT REORDERED.
+ The new VIEW IS REORDERED:
+ 8 x 3 matrix:
+ 2, 1, 0
+ 5, 4, 3
+ 8, 7, 6
+ 11, 10, 9
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+@param matrix the matrix to be partitioned. +@param column the index of the column to partition on. +@param splitters the values at which the rows shall be split into intervals. + Must be sorted ascending and must not contain multiple identical values. + These preconditions are not checked; be sure that they are met. + +@param splitIndexes a list into which this method fills the indexes of rows delimiting intervals. +Therefore, must satisfy splitIndexes.length >= splitters.length. + +@return a new matrix view having rows partitioned by the given column and splitters. +*/ +public static DoubleMatrix2D partition(DoubleMatrix2D matrix, int column, final double[] splitters, int[] splitIndexes) { + int rowFrom = 0; + int rowTo = matrix.rows()-1; + int splitFrom = 0; + int splitTo = splitters.length-1; + int[] rowIndexes = new int[matrix.rows()]; // row indexes to reorder instead of matrix itself + for (int i=rowIndexes.length; --i >= 0; ) rowIndexes[i] = i; + + partition(matrix,rowIndexes,rowFrom,rowTo,column,splitters,splitFrom,splitTo,splitIndexes); + + // take all columns in the original order + int[] columnIndexes = new int[matrix.columns()]; + for (int i=columnIndexes.length; --i >= 0; ) columnIndexes[i] = i; + + // view the matrix according to the reordered row indexes + return matrix.viewSelection(rowIndexes,columnIndexes); +} +/** +Same as {@link #partition(int[],int,int,int[],int,int,int[])} +except that it synchronously partitions the rows of the given matrix by the values of the given matrix column; +This is essentially the same as partitioning a list of composite objects by some instance variable; +In other words, two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Let's say, a "row" is an "object" (tuple, d-dimensional point). +A "column" is the list of "object" values of a given variable (field, dimension). +A "matrix" is a list of "objects" (tuples, points). +

+Now, rows (objects, tuples) are partially sorted according to their values in one given variable (dimension). +Two entire rows of the matrix are swapped, whenever two column values indicate so. +

+Of course, the column must not be a column of a different matrix. +More formally, there must hold:
+There exists an i such that matrix.viewColumn(i)==column. +

+Note that arguments are not checked for validity. +

+Example: + + + + + + +
8 x 3 matrix:
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+ 11, 10, 9
+ 8, 7, 6
+ 5, 4, 3
+ 2, 1, 0
+

column = matrix.viewColumn(0);
+ a = 0;
+ b = column.size()-1;

+ splitters={5,10,12}
+ c=0;
+ d=splitters.length-1;

+ partition(matrix,column,a,b,splitters,c,d,splitIndexes);
+ ==>
+ splitIndexes == {0, 2, 3}

+
8 x 3 matrix:
+ 2, 1, 0
+ 5, 4, 3
+ 8, 7, 6
+ 11, 10, 9
+ 23, 22, 21
+ 20, 19, 18
+ 17, 16, 15
+ 14, 13, 12
+*/ +private static void xPartitionOld(DoubleMatrix2D matrix, DoubleMatrix1D column, int from, int to, double[] splitters, int splitFrom, int splitTo, int[] splitIndexes) { + /* + double splitter; // int, double --> template type dependent + + if (splitFrom>splitTo) return; // nothing to do + if (from>to) { // all bins are empty + from--; + for (int i = splitFrom; i<=splitTo; ) splitIndexes[i++] = from; + return; + } + + // Choose a partition (pivot) index, m + // Ideally, the pivot should be the median, because a median splits a list into two equal sized sublists. + // However, computing the median is expensive, so we use an approximation. + int medianIndex; + if (splitFrom==splitTo) { // we don't really have a choice + medianIndex = splitFrom; + } + else { // we do have a choice + int m = (from+to) / 2; // Small arrays, middle element + int len = to-from+1; + if (len > SMALL) { + int l = from; + int n = to; + if (len > MEDIUM) { // Big arrays, pseudomedian of 9 + int s = len/8; + l = med3(column, l, l+s, l+2*s); + m = med3(column, m-s, m, m+s); + n = med3(column, n-2*s, n-s, n); + } + m = med3(column, l, m, n); // Mid-size, pseudomedian of 3 + } + + // Find the splitter closest to the pivot, i.e. the splitter that best splits the list into two equal sized sublists. + medianIndex = org.apache.mahout.colt.Sorting.binarySearchFromTo(splitters,column.getQuick(m),splitFrom,splitTo); + if (medianIndex < 0) medianIndex = -medianIndex - 1; // not found + if (medianIndex > splitTo) medianIndex = splitTo; // not found, one past the end + + } + splitter = splitters[medianIndex]; + + // Partition the list according to the splitter, i.e. + // Establish invariant: list[i] < splitter <= list[j] for i=from..medianIndex and j=medianIndex+1 .. to + int splitIndex = xPartitionOld(matrix,column,from,to,splitter); + splitIndexes[medianIndex] = splitIndex; + + // Optimization: Handle special cases to cut down recursions. 
+ if (splitIndex < from) { // no element falls into this bin + // all bins with splitters[i] <= splitter are empty + int i = medianIndex-1; + while (i>=splitFrom && (!(splitter < splitters[i]))) splitIndexes[i--] = splitIndex; + splitFrom = medianIndex+1; + } + else if (splitIndex >= to) { // all elements fall into this bin + // all bins with splitters[i] >= splitter are empty + int i = medianIndex+1; + while (i<=splitTo && (!(splitter > splitters[i]))) splitIndexes[i++] = splitIndex; + splitTo = medianIndex-1; + } + + // recursively partition left half + if (splitFrom <= medianIndex-1) { + xPartitionOld(matrix, column, from, splitIndex, splitters, splitFrom, medianIndex-1, splitIndexes); + } + + // recursively partition right half + if (medianIndex+1 <= splitTo) { + xPartitionOld(matrix, column, splitIndex+1, to, splitters, medianIndex+1, splitTo, splitIndexes); + } + */ +} +/** + * Same as {@link #partition(int[],int,int,int)} + * except that it synchronously partitions the rows of the given matrix by the values of the given matrix column; + * This is essentially the same as partitioning a list of composite objects by some instance variable; + * In other words, two entire rows of the matrix are swapped, whenever two column values indicate so. + *

+ * Let's say, a "row" is an "object" (tuple, d-dimensional point). + * A "column" is the list of "object" values of a given variable (field, dimension). + * A "matrix" is a list of "objects" (tuples, points). + *

+ * Now, rows (objects, tuples) are partially sorted according to their values in one given variable (dimension). + * Two entire rows of the matrix are swapped, whenever two column values indicate so. + *

+ * Of course, the column must not be a column of a different matrix. + * More formally, there must hold:
+ * There exists an i such that matrix.viewColumn(i)==column. + * + * Note that arguments are not checked for validity. + */ +private static int xPartitionOld(DoubleMatrix2D matrix, DoubleMatrix1D column, int from, int to, double splitter) { + /* + double element; // int, double --> template type dependent + for (int i=from-1; ++i<=to; ) { + element = column.getQuick(i); + if (element < splitter) { + // swap x[i] with x[from] + matrix.swapRows(i,from); + from++; + } + } + return from-1; + */ + return 0; +} +} Propchange: lucene/mahout/trunk/matrix/src/main/java/org/apache/mahout/matrix/matrix/doublealgo/Partitioning.java ------------------------------------------------------------------------------ svn:eol-style = native
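
The Javadoc example above (8 x 3 matrix, column = 0, splitters = {5,10,12}) can be reproduced with a small stand-alone sketch. This is not the Colt/Mahout implementation: the class name PartitionSketch, the plain double[][] matrix, and the use of a full sort in place of the O( N * log(k)) partitioning are all assumptions made for illustration. A full sort is a stronger ordering than a partition, so the resulting rowIndexes differ from the ones shown in the Javadoc, while splitIndexes come out the same.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not part of org.apache.mahout.colt.
public class PartitionSketch {

    // Reorders rowIndexes (not the matrix) by the values in the given column and
    // fills splitIndexes[k] with the index of the last row whose value is < splitters[k].
    // Assumes splitters is sorted ascending, as the Javadoc requires.
    static void partition(double[][] matrix, int[] rowIndexes, int column,
                          double[] splitters, int[] splitIndexes) {
        // Box the indexes so they can be sorted with a comparator.
        Integer[] boxed = new Integer[rowIndexes.length];
        for (int i = 0; i < boxed.length; i++) boxed[i] = rowIndexes[i];

        // Simplification: a full sort instead of partial sorting into k+1 intervals.
        Arrays.sort(boxed, Comparator.comparingDouble((Integer r) -> matrix[r][column]));
        for (int i = 0; i < boxed.length; i++) rowIndexes[i] = boxed[i];

        // Walk the ordered rows once; pos counts rows with value < current splitter.
        int pos = 0;
        for (int k = 0; k < splitters.length; k++) {
            while (pos < rowIndexes.length && matrix[rowIndexes[pos]][column] < splitters[k]) pos++;
            splitIndexes[k] = pos - 1;
        }
    }

    public static void main(String[] args) {
        double[][] matrix = {
            {23, 22, 21}, {20, 19, 18}, {17, 16, 15}, {14, 13, 12},
            {11, 10, 9}, {8, 7, 6}, {5, 4, 3}, {2, 1, 0}
        };
        int[] rowIndexes = {0, 1, 2, 3, 4, 5, 6, 7};
        double[] splitters = {5, 10, 12};
        int[] splitIndexes = new int[splitters.length];

        partition(matrix, rowIndexes, 0, splitters, splitIndexes);

        System.out.println("splitIndexes = " + Arrays.toString(splitIndexes)); // [0, 2, 3]
        System.out.println("rowIndexes   = " + Arrays.toString(rowIndexes));   // [7, 6, 5, 4, 3, 2, 1, 0]
    }
}
```

As in the Javadoc example, only the index array is permuted; the matrix itself is left untouched, which is what allows the four-argument partition method to return a reordered view via viewSelection.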