mahout-dev mailing list archives

From "Drew Farris (JIRA)" <>
Subject [jira] Resolved: (MAHOUT-401) Use NamedVector in seq2sparse
Date Fri, 02 Jul 2010 02:46:49 GMT


Drew Farris resolved MAHOUT-401.

         Assignee: Drew Farris
    Fix Version/s: 0.4
       Resolution: Fixed

Actually, most of this went in as part of MAHOUT-167 (committed in r952758); the
only thing missing was the fix to PartialVectorMergeReducer, which I've now committed.

> Use NamedVector in seq2sparse
> -----------------------------
>                 Key: MAHOUT-401
>                 URL:
>             Project: Mahout
>          Issue Type: Bug
>          Components: Utils
>    Affects Versions: 0.4
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>             Fix For: 0.4
>         Attachments: MAHOUT-401.patch, pv.patch
> In seq2sparse, TFIDFPartialVectorReducer and TFPartialVectorReducer should write NamedVectors.
> It appears that a lack of labels on the vector input to k-means at least breaks the cluster-dumper
> in the sense that it no longer prints the original document ids for points.
> See:
> I wonder if this is also an issue with the code that generates vectors from Lucene indexes?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
