mahout-dev mailing list archives

From "Robin Anil (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAHOUT-148) Convert Classification Algs to use richer Writable syntax
Date Wed, 07 Oct 2009 12:41:31 GMT

     [ https://issues.apache.org/jira/browse/MAHOUT-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Anil updated MAHOUT-148:
------------------------------

    Attachment: MAHOUT-148.patch

Verified by running both training and testing for every combination of:

Bayes|CBayes
hdfs|hbase
sequential|mapreduce

Noticed a slight improvement in the running time of various map/reduce jobs (a 20% decrease
on the 20newsgroups dataset).



> Convert Classification Algs to use richer Writable syntax
> ---------------------------------------------------------
>
>                 Key: MAHOUT-148
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-148
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.1, 0.2
>            Reporter: Grant Ingersoll
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-148-Work-In-Progress.patch, MAHOUT-148.patch
>
>
> Much of the classification code relies on parsing values out of the Text object
just to determine what type of "thing" is being used.  We should avoid string manipulation
for this and instead encapsulate the type information in Writable instances.
This should make things faster and bring stronger typing to the problem, which should
make the code easier to understand and debug.
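
To illustrate the idea in the issue description above, here is a minimal sketch of a typed key that carries its "kind" as a field rather than as a prefix inside a Text string. The class FeatureKey, its Kind enum, and the roundTrip helper are hypothetical names chosen for this example; they are not taken from the MAHOUT-148 patch. To keep the sketch runnable without Hadoop on the classpath, it only mirrors the method shapes of org.apache.hadoop.io.Writable (write(DataOutput) and readFields(DataInput)) and does not implement the interface itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical typed key; an illustration, not the class from the patch. */
final class FeatureKey {
    /** Assumed set of record kinds, replacing string prefixes in a Text key. */
    enum Kind { LABEL_WEIGHT, FEATURE_COUNT, DOCUMENT_FREQUENCY }

    private Kind kind;
    private String label;
    private String feature;

    /** No-arg constructor, since Writable-style objects are created empty and then filled. */
    FeatureKey() {}

    FeatureKey(Kind kind, String label, String feature) {
        this.kind = kind;
        this.label = label;
        this.feature = feature;
    }

    /** Same shape as Writable#write: the kind is one byte, no string parsing needed later. */
    public void write(DataOutput out) throws IOException {
        out.writeByte(kind.ordinal());
        out.writeUTF(label);
        out.writeUTF(feature);
    }

    /** Same shape as Writable#readFields: fields come back already typed. */
    public void readFields(DataInput in) throws IOException {
        kind = Kind.values()[in.readByte()];
        label = in.readUTF();
        feature = in.readUTF();
    }

    Kind kind() { return kind; }
    String label() { return label; }
    String feature() { return feature; }

    /** Serializes a key to bytes and reads it back into a fresh instance. */
    static FeatureKey roundTrip(FeatureKey key) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        key.write(new DataOutputStream(bytes));
        FeatureKey copy = new FeatureKey();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }
}
```

With a key like this, a mapper or reducer could branch on key.kind() directly instead of splitting and inspecting a string. A real Hadoop key would additionally implement WritableComparable so it can be sorted during the shuffle.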

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

