mahout-dev mailing list archives

From "Jeff Eastman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAHOUT-294) Uniform API behavior for Jobs
Date Fri, 16 Jul 2010 17:46:50 GMT

     [ https://issues.apache.org/jira/browse/MAHOUT-294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Eastman updated MAHOUT-294:
--------------------------------

    Attachment: MAHOUT-294a.patch

Here's a stab at improving the testability of AbstractJob option parsing. The patch adds an argMap
variable to AbstractJob, along with new getOption() and hasOption() methods that encapsulate
the "--" prepending, so no additional constants are needed. Factoring out ClusterDumper.addOptions()
as a public method allows unit testing of the command-line processing without invoking
the cluster dumper. We could require this in all subclasses by adding AbstractJob.run() and
having it call a new abstract addOptions(). That would have a broad impact on all drivers, so
I have not done it in this patch.
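
A minimal sketch of the idea, not the patch itself: SketchJob and SketchDumper are hypothetical stand-ins for AbstractJob and ClusterDumper, the option names "input", "output" and "verbose" are made up for illustration, and the parsing is a plain loop rather than whatever option parser the real code uses.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for AbstractJob; not the actual Mahout class.
abstract class SketchJob {
  // Options the subclass has declared, stored with the "--" prefix.
  private final Set<String> declared = new HashSet<>();
  // Parsed arguments keyed by "--name"; flags without a value map to null.
  private final Map<String, String> argMap = new HashMap<>();

  // Subclasses declare their options here. Because it is public, a test can
  // call it and then parse arguments without running the job itself.
  public abstract void addOptions();

  protected void addOption(String name) {
    declared.add(keyFor(name));
  }

  public void parseArguments(String[] args) {
    argMap.clear();
    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      if (!declared.contains(arg)) {
        throw new IllegalArgumentException("Unknown option: " + arg);
      }
      // A following token that is not itself an option is taken as the value.
      String value = null;
      if (i + 1 < args.length && !args[i + 1].startsWith("--")) {
        value = args[++i];
      }
      argMap.put(arg, value);
    }
  }

  // Callers pass the bare name; the "--" prepending lives in one place.
  public String getOption(String name) {
    return argMap.get(keyFor(name));
  }

  public boolean hasOption(String name) {
    return argMap.containsKey(keyFor(name));
  }

  private static String keyFor(String name) {
    return "--" + name;
  }
}

// Hypothetical stand-in for ClusterDumper with made-up option names.
class SketchDumper extends SketchJob {
  @Override
  public void addOptions() {
    addOption("input");
    addOption("output");
    addOption("verbose");
  }
}
```

A unit test can then construct a SketchDumper, call addOptions() and parseArguments(), and assert on getOption("input") or hasOption("verbose") without any dumping taking place.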

As a further step, one could imagine moving all of the common options from DefaultOptionCreator
into AbstractJob. This would put all of the shared Mahout command-line options in a single
place, improving consistency.

Comments on this approach are welcome. I'm gone for the weekend.

> Uniform API behavior for Jobs
> -----------------------------
>
>                 Key: MAHOUT-294
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-294
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering, Collaborative Filtering, Frequent Itemset/Association Rule Mining, Genetic Algorithms, Math, Utils
>    Affects Versions: 0.4
>            Reporter: Robin Anil
>             Fix For: 0.4
>
>         Attachments: MAHOUT-294.patch, MAHOUT-294.patch, MAHOUT-294a.patch
>
>
> * Move AbstractJob to common and convert all the Driver classes to extend that.
>    One suggestion is:
>    AlgorithmParams params = ParamsBuilder.build().withParam("-i", input).withParam("-o", output)....
>    MyAlgorithm.runJob(params) throws ParameterMissingException;
> * Give uniform command-line parameters for various algorithms.
>    e.g., the distance measure is currently -d, -dm, or -m in different places in clustering
> * Add a temp directory as a parameter http://www.lucidimagination.com/search/document/28a979aa62c02a1/who_owns_mahout_bucket_on_s3#ddb5855e8bdace45
> This issue will keep track of all discussion/patches related to the design and cleanup of Mahout API
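
For reference, the builder suggestion quoted in the issue description could be sketched roughly as below. Only the names AlgorithmParams, ParamsBuilder, withParam, runJob and ParameterMissingException come from the issue text. The terminal create() call, the require() lookup and the body of runJob are assumptions made up for illustration, since the original snippet elides how the chain ends.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Checked exception named in the issue; the message format is an assumption.
class ParameterMissingException extends Exception {
  ParameterMissingException(String flag) {
    super("Missing required parameter: " + flag);
  }
}

// Immutable bag of parameters keyed by flag, e.g. "-i".
final class AlgorithmParams {
  private final Map<String, String> values;

  AlgorithmParams(Map<String, String> values) {
    this.values = new LinkedHashMap<>(values);
  }

  // Hypothetical helper: returns the value or reports the missing flag.
  String require(String flag) throws ParameterMissingException {
    String value = values.get(flag);
    if (value == null) {
      throw new ParameterMissingException(flag);
    }
    return value;
  }
}

final class ParamsBuilder {
  private final Map<String, String> values = new LinkedHashMap<>();

  private ParamsBuilder() {}

  // Static entry point, matching the ParamsBuilder.build() call in the issue.
  static ParamsBuilder build() {
    return new ParamsBuilder();
  }

  ParamsBuilder withParam(String flag, String value) {
    values.put(flag, value);
    return this;
  }

  // Hypothetical terminal step; the issue text does not show how the chain ends.
  AlgorithmParams create() {
    return new AlgorithmParams(values);
  }
}

// Placeholder algorithm: validates its parameters and reports what it would do.
final class MyAlgorithm {
  private MyAlgorithm() {}

  static String runJob(AlgorithmParams params) throws ParameterMissingException {
    String input = params.require("-i");
    String output = params.require("-o");
    return input + " -> " + output;
  }
}
```

The point of the shape is that every driver would report a missing parameter in the same way, through one exception type, instead of each one parsing and validating its own flags.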

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

