spark-reviews mailing list archives

From BryanCutler <>
Subject [GitHub] spark pull request: [SPARK-13963][ML] Adding binary toggle param t...
Date Fri, 18 Mar 2016 21:47:02 GMT
GitHub user BryanCutler opened a pull request:

    [SPARK-13963][ML] Adding binary toggle param to HashingTF

    ## What changes were proposed in this pull request?
    Adds a binary toggle parameter to ml.feature.HashingTF and to mllib.feature.HashingTF,
    since the former wraps the latter. When the parameter is true, all non-zero term counts
    are set to 1, turning term-count features into binary values that are well suited to
    discrete probability models.
    ## How was this patch tested?
    Added unit tests for ML and MLlib
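
The effect of the toggle described above can be illustrated with a minimal pure-Python sketch. This is not Spark's implementation: the function name `hashing_tf`, its parameters, and the use of `zlib.crc32` as the hash are assumptions made only for illustration (Spark uses its own hash function and returns sparse vectors).

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=16, binary=False):
    """Hypothetical helper: map terms to a sparse {index: value} dict via the hashing trick."""
    counts = Counter()
    for term in terms:
        # crc32 is a stand-in hash chosen for determinism; it is NOT the hash Spark uses.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1
    if binary:
        # With the toggle on, every non-zero count collapses to 1.
        return {index: 1.0 for index in counts}
    # With the toggle off, the raw term counts are kept.
    return {index: float(count) for index, count in counts.items()}


doc = ["a", "a", "b", "b", "b", "c"]
counts = hashing_tf(doc)              # raw term counts per hashed index
flags = hashing_tf(doc, binary=True)  # same indices, every value equal to 1.0
```

Both calls populate the same indices; only the stored values differ, which is why the binary form suits models that expect presence/absence features rather than frequencies.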

You can merge this pull request into a Git repository by running:

    $ git pull binary-param-HashingTF-SPARK-13963

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11832

commit a5ff3309c0d07e57177374133130803eb98ebffb
Author: Bryan Cutler <>
Date:   2016-03-18T21:19:19Z

    [SPARK-13963] Adding binary toggle to HashingTF in ml/mllib

commit 31097231769860b86d1d3234ebf7d4e95f96e5cb
Author: Bryan Cutler <>
Date:   2016-03-18T21:19:48Z

    Added unit test for HashingTF binary toggle

commit ca1436166a1292f92d72408c10cf606623b31bbd
Author: Bryan Cutler <>
Date:   2016-03-18T21:26:34Z

    fixed param description text


