mahout-dev mailing list archives

From "Shannon Quinn (Commented) (JIRA)" <>
Subject [jira] [Commented] (MAHOUT-524) DisplaySpectralKMeans example fails
Date Sun, 23 Oct 2011 00:12:32 GMT


Shannon Quinn commented on MAHOUT-524:

If there are two DLS.runJob() methods and the spectral code is the only caller of one of
them, then in the interest of making the codebase just a tiny bit more maintainable I would
vote for switching the spectral code over to the other runJob() and deleting the unused one
from DLS entirely.

Regarding your tracing of the DRM.times() method, I was having the same problem: there are
so many chained job constructors that the call path is difficult to follow. Is there any
way we could simplify TimesSquaredJob? Is each of those job creation methods actually called
from multiple places in the code base?

Regarding this issue, it sounds like the problem lies either in TimesSquared not correctly
setting the path, as you mentioned (though this raises the question of why no other algorithm
that uses DRM.times() runs into the same problem), or in the Configuration voodoo in SKMD.

I'll investigate the manipulation of Configuration objects in SKMD this week. If you have
any thoughts on the other points, please let me know.
> DisplaySpectralKMeans example fails
> -----------------------------------
>                 Key: MAHOUT-524
>                 URL:
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.4, 0.5
>            Reporter: Jeff Eastman
>            Assignee: Shannon Quinn
>              Labels: clustering, k-means, visualization
>             Fix For: 0.6
>         Attachments: EclipseLog_20110918.txt, SpectralKMeans_fail_20110919.txt, aff.txt, raw.txt, spectralkmeans.png
>
> I've committed a new display example that attempts to push the standard mixture of models
> data set through spectral k-means. After some tweaking of configuration arguments and a bug
> fix in EigenCleanupJob it runs spectral k-means to completion. The display example is expecting
> 2-d clustered points and the example is producing 5-d points. Additional I/O work is needed
> before this will play with the rest of the clustering algorithms.

