mahout-dev mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Updated] (MAHOUT-763) Map-Side Distance Comparison
Date Sat, 16 Jul 2011 12:44:59 GMT


Sean Owen updated MAHOUT-763:

    Attachment: SeedVectorUtil.patch

This is what I had in mind -- it looks like more change than it is due to whitespace. The
key is just letting it take care of iterating over files in subdirs. Just a little tidier,
not a big deal either way.

I can make a similar change in kmeans.

> Map-Side Distance Comparison
> ----------------------------
>                 Key: MAHOUT-763
>                 URL:
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.6
>         Attachments: MAHOUT-763.patch, MAHOUT-763.patch, MAHOUT-763.patch, MAHOUT-763.patch,
>
> KMeans currently calculates, on the map side, the distance between a set of seeds and all
> other vectors.  It would be handy to have a generalization of this that, given a set of vectors
> that fits in memory (the seeds) and other points, emits <seed id, other id, distance>
> according to the distance measure.  This is similar to the RowSimilarityJob, but much simpler
> and not as general purpose.
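
For illustration only, here is a minimal Java sketch of the map-side comparison described in the issue. It assumes plain double[] vectors and a Euclidean measure instead of Mahout's own Vector and DistanceMeasure types, and it leaves out the Hadoop plumbing. The names SeedDistanceSketch, DistanceTriple, Distance, and map are hypothetical and are not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the MAHOUT-763 patches.
class SeedDistanceSketch {

    /** One emitted record: <seed id, other id, distance>. */
    static final class DistanceTriple {
        final String seedId;
        final String otherId;
        final double distance;

        DistanceTriple(String seedId, String otherId, double distance) {
            this.seedId = seedId;
            this.otherId = otherId;
            this.distance = distance;
        }

        @Override
        public String toString() {
            return seedId + "\t" + otherId + "\t" + distance;
        }
    }

    /** Stand-in for a pluggable distance measure (assumption, not Mahout's interface). */
    interface Distance {
        double between(double[] a, double[] b);
    }

    /** Euclidean distance, chosen here only as an example measure. */
    static final Distance EUCLIDEAN = (a, b) -> {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    };

    /**
     * Mimics a single map() call: the seeds are already held in memory, and one
     * input vector is compared against each of them, emitting one triple per seed.
     */
    static List<DistanceTriple> map(Map<String, double[]> seeds,
                                    String otherId,
                                    double[] other,
                                    Distance measure) {
        List<DistanceTriple> emitted = new ArrayList<>(seeds.size());
        for (Map.Entry<String, double[]> seed : seeds.entrySet()) {
            double d = measure.between(seed.getValue(), other);
            emitted.add(new DistanceTriple(seed.getKey(), otherId, d));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Seeds small enough to keep in memory; insertion order is preserved.
        Map<String, double[]> seeds = new LinkedHashMap<>();
        seeds.put("s1", new double[] {0.0, 0.0});
        seeds.put("s2", new double[] {3.0, 4.0});

        // One "other" point, as a mapper would receive it.
        for (DistanceTriple t : map(seeds, "p1", new double[] {3.0, 0.0}, EUCLIDEAN)) {
            System.out.println(t);
        }
    }
}
```

In an actual job the seeds would presumably be loaded once per mapper, and each triple would be written to the job output instead of being collected in a list.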

This message is automatically generated by JIRA.
For more information on JIRA, see:

