hivemall-issues mailing list archives

From takuti <>
Subject [GitHub] incubator-hivemall pull request #84: [WIP][HIVEMALL-19] Support DIMSUM for a...
Date Fri, 02 Jun 2017 22:29:52 GMT
GitHub user takuti opened a pull request:

    [WIP][HIVEMALL-19] Support DIMSUM for approx. all-pairs similarity

    ## What changes were proposed in this pull request?
    Support DIMSUM (Dimension Independent Matrix Square using MapReduce) for approximate
    all-pairs similarity computation. This makes item-based collaborative filtering (CF) more efficient.
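For readers unfamiliar with DIMSUM, the sampling idea can be sketched in a few lines of Python. This is only an illustration of the published algorithm as commonly described, not the code in this pull request: the function names `column_norms` and `dimsum_similarities`, the sparse-dict row format, and the in-memory "map" and "reduce" loops are all assumptions made for the example. `column_norms` plays roughly the role that an L2-norm aggregate would play in a real pipeline.

```python
import math
import random
from collections import defaultdict


def column_norms(rows):
    """Return the L2 norm of every column.

    Each row is a sparse dict {column: value}; only non-zero values
    are expected, so every norm is strictly positive.
    """
    squares = defaultdict(float)
    for row in rows:
        for col, value in row.items():
            squares[col] += value * value
    return {col: math.sqrt(total) for col, total in squares.items()}


def dimsum_similarities(rows, gamma, rng=None):
    """Approximate cosine similarity for all column pairs (j, k), j < k.

    gamma is the oversampling parameter: larger values sample more
    pairs and give more accurate estimates at a higher cost.
    """
    rng = rng or random.Random(0)
    norms = column_norms(rows)
    partial_sums = defaultdict(float)

    # "Map" phase: emit a_ij * a_ik with probability
    # min(1, gamma / (||c_j|| * ||c_k||)).
    for row in rows:
        cols = sorted(row)
        for idx, j in enumerate(cols):
            for k in cols[idx + 1:]:
                keep_prob = min(1.0, gamma / (norms[j] * norms[k]))
                if rng.random() < keep_prob:
                    partial_sums[(j, k)] += row[j] * row[k]

    # "Reduce" phase: rescale so that the estimate is unbiased.
    similarities = {}
    for (j, k), total in partial_sums.items():
        denom = norms[j] * norms[k]
        if gamma / denom > 1.0:
            similarities[(j, k)] = total / denom  # pair was never dropped
        else:
            similarities[(j, k)] = total / gamma  # undo the sampling rate
    return similarities
```

With a very large `gamma` every pair is kept and the result is the exact cosine similarity; smaller values trade accuracy for fewer emitted pairs, which is where the efficiency gain for item-based CF comes from.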
    ## What type of PR is it?
    ## What is the Jira issue?
    ## How was this patch tested?
    - Unit tests
    - Manual tests on EMR
    ### TODO
    - [ ] Documentation
    - [ ] Evaluate on larger data e.g. MovieLens

You can merge this pull request into a Git repository by running:

    $ git pull DIMSUM

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #84

commit 1a661cef229a508655352c360a2890bd66da1ab0
Author: Takuya Kitazawa <>
Date:   2017-06-01T03:30:08Z

    Add `l2_norm` UDAF

commit c19abc5b8e603b65595346c6fb76329a09a1e02c
Author: Takuya Kitazawa <>
Date:   2017-06-01T09:10:16Z

    Implement DIMSUM mapper

commit 44367b29056752b32bbbd9601e9500fa6398e8ef
Author: Takuya Kitazawa <>
Date:   2017-06-02T01:58:40Z

    Make symmetric output (j, k), (k, j) configurable

commit a6e854c856ce3deef46e6b8b0293497d57e82901
Author: Takuya Kitazawa <>
Date:   2017-06-02T03:16:23Z

    Support string feature

commit 97cb91d8fef0cd2f85657a02bd9a2505d7551337
Author: Takuya Kitazawa <>
Date:   2017-06-02T03:28:22Z

    Fix so that default `gamma` is computed correctly

commit b42b65b1cb358a89cd402f90b5ec3d6c79ff465c
Author: Takuya Kitazawa <>
Date:   2017-06-02T07:04:25Z

    Add unit test for DIMSUMMapperUDTF


