mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1493) Port Naive Bayes to the Spark DSL
Date Sun, 05 Apr 2015 22:35:07 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14425996#comment-14425996 ]

ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------

GitHub user andrewpalumbo opened a pull request:

    https://github.com/apache/mahout/pull/111

    MAHOUT-1493: Add CLI options for --overwrite and --alphaI to NB Drivers

    Presently `mahout spark-trainnb` will not complete if a model already exists in the output directory. This pull request adds an `--overwrite` option that overwrites an existing model in the given output directory.
    
    It also:
      - adds `.par(auto = true)` to the input Drm
      - adds a `delete(...)` method to `Hadoop1HDFSUtil`, which does not handle any IO exceptions
      - adds an almost trivial `--alphaI` option to set the Laplace smoothing factor from the CLI
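
    For readers unfamiliar with the overwrite pattern, here is a minimal sketch of the behaviour described above. It is not the code in this pull request: it uses `java.nio.file` on a local directory instead of HDFS, and the class name `OverwriteSketch` and the helper `prepareOutput` are hypothetical. As with the `delete(...)` method mentioned above, IO exceptions are not handled and simply propagate to the caller.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OverwriteSketch {

    // Hypothetical helper: clear an existing output directory before training,
    // but only when the caller asked for it (the --overwrite case).
    static void prepareOutput(Path output, boolean overwrite) throws IOException {
        if (!Files.exists(output)) {
            return; // nothing to clear
        }
        if (!overwrite) {
            // Without the flag, refuse to touch an existing model.
            throw new FileAlreadyExistsException(output.toString());
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(output)) {
            // Reverse order lists children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p); // IOException propagates, no handling here
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an output directory that already holds a model.
        Path model = Files.createTempDirectory("nb-model");
        Files.createFile(model.resolve("part-0"));

        try {
            prepareOutput(model, false);
        } catch (FileAlreadyExistsException e) {
            System.out.println("refused: output exists and overwrite is off");
        }

        prepareOutput(model, true);
        System.out.println("output still exists: " + Files.exists(model));
    }
}
```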
    
    This patch will complete the full port of the old MapReduce Naive Bayes to the `math-scala` and `spark` modules.
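
    To illustrate what a Laplace smoothing factor such as the one set by `--alphaI` does, here is a minimal sketch of textbook additive smoothing. It is an assumption that this matches the general idea only; the exact weighting used by Mahout's Naive Bayes trainers may differ, and the class name `LaplaceSmoothingSketch` and method `smoothed` are hypothetical.

```java
class LaplaceSmoothingSketch {

    // Additive (Laplace) smoothing of a per-label term frequency:
    // (count + alpha) / (total + alpha * vocabularySize).
    // With alpha > 0, terms never seen with a label get a non-zero estimate.
    static double smoothed(double termCount, double labelTotal,
                           int vocabularySize, double alpha) {
        return (termCount + alpha) / (labelTotal + alpha * vocabularySize);
    }

    public static void main(String[] args) {
        double[] counts = {3.0, 0.0, 1.0}; // made-up term counts for one label
        double total = 4.0;                // sum of the counts above

        for (double alpha : new double[] {0.0, 1.0}) {
            System.out.println("alpha = " + alpha);
            for (double c : counts) {
                System.out.println("  " + smoothed(c, total, counts.length, alpha));
            }
        }
    }
}
```

    With `alpha = 0` the unseen term keeps an estimate of zero; any positive value moves every estimate toward the uniform distribution while the estimates still sum to one.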

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewpalumbo/mahout MAHOUT-1493g

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/111.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #111
    
----
commit 37f40e1ba32871227cc025effe87b135b5c28f31
Author: Andrew Palumbo <apalumbo@apache.org>
Date:   2015-04-05T20:56:48Z

    set .par(auto = true) on input data in CLI Drivers

commit 149b0d0ef4ee5ea6e161b669f09d4fa2eeb2fff3
Author: Andrew Palumbo <apalumbo@apache.org>
Date:   2015-04-05T22:12:27Z

    add CLI driver options for -ow and -alphaI.  added a delete(...) method in Hadoop1HDFSUtil.

commit 59ca8c1b2e960b5bfc10de39d5cd8e2bb0042c12
Author: Andrew Palumbo <apalumbo@apache.org>
Date:   2015-04-05T22:14:58Z

    Adjust Example accordingly

----


> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
>                 Key: MAHOUT-1493
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1493
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>            Reporter: Sebastian Schelter
>            Assignee: Andrew Palumbo
>              Labels: DSL, h2o, scala
>             Fix For: 0.10.0
>
>         Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new Spark DSL. Shouldn't require more than a few lines of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
