mahout-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Drew Farris (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAHOUT-291) Mahout Code Cleanup
Date Mon, 15 Feb 2010 14:27:27 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833806#action_12833806
] 

Drew Farris commented on MAHOUT-291:
------------------------------------

Thanks very much Robin for posting a patch to review. Things like KMeansClusterer.log.debug
-> log.debug look great herehere, and I'm ok with the whitespace oriented changes for the
most part, bu there are some cases where auto-code formatting is really making a hash of things:

e.g:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] sd=[" + sdx
+ ", " + sdy + ']');
{code}

gets transformed to:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] sd=[" + sdx
+ ", " + sdy
                        + ']');
{code}

which despite the 120 line length rule seems a little too strict IMHO. 

Also, a nicely formatted OptionBuilder is turned into something nasty and unreadable.

{code}
-    Option clustersOpt = obuilder
-        .withLongName("clusters")
-        .withRequired(true)
-        .withArgument(abuilder.withName("clusters").withMinimum(1).withMaximum(1).create())
-        .withDescription(
-          "The input centroids, as Vectors.  Must be a SequenceFile of Writable, Cluster/Canopy.
 "
-              + "If k is also specified, then a random set of vectors will be selected and
written out to this path first")
+    Option clustersOpt = obuilder.withLongName("clusters").withRequired(true).withArgument(
+      abuilder.withName("clusters").withMinimum(1).withMaximum(1).create()).withDescription(
+      "The input centroids, as Vectors.  Must be a SequenceFile of Writable, Cluster/Canopy.
 "
+          + "If k is also specified, then a random set of vectors will be selected and"
+          + "written out to this path first")
         .withShortName("c").create();
{code}

And things like the following, but honestly which of these is the greater sin? (From LDAInference)

{code}
-    double t = f
-               * (-1 / 12.0 + f
-                              * (1 / 120.0 + f
-                                             * (-1 / 252.0 + f
-                                                             * (1 / 240.0 + f
-                                                                            * (-1 / 132.0
+ f
-                                                                                        
   * (691 / 32760.0 + f
-                                                                                        
                      * (-1 / 12.0 + f * 3617.0 / 8160.0)))))));
+    double t = f * (-1 / 12.0 + f * (1 / 120.0 + f * (-1 / 252.0 
+        + f * (1 / 240.0 + f * (-1 / 132.0 + f * (691 / 32760.0 + f * (-1 / 12.0 + f * 3617.0
/ 8160.0)))))));
{code}

What's the best way to proceed from here given this?

> Mahout Code Cleanup
> -------------------
>
>                 Key: MAHOUT-291
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-291
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering, Collaborative Filtering, Frequent Itemset/Association
Rule Mining, Genetic Algorithms, Math, Utils, Website
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>            Priority: Minor
>             Fix For: 0.3
>
>         Attachments: MAHOUT-291.patch
>
>
> Code Cleanup
> Organize imports
> Remove space in blank lines
> make local variables final

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message