mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Understanding SVD CLI inputs
Date Sun, 08 Aug 2010 20:47:53 GMT
Also, cleansvd appears to be spewing a bunch of numbers (something about largestCleanEigens)
to the log.  It's almost completely unreadable.  Any objection to me demoting that logging to
debug level at a minimum?  Or can it be removed?

On Aug 8, 2010, at 3:35 PM, Grant Ingersoll wrote:

> Just to make sure I'm understanding, the docs for "clean SVD" at https://cwiki.apache.org/confluence/display/MAHOUT/Dimensional+Reduction
> are not correct, right?
> 
> In looking at the code, the SVD command requires --Dmapred.input.dir (soon to be --input
> like everything else, see MAHOUT-461), a --tempDir, and --Dmapred.output.dir (soon to be --output).
> Then, in the cleansvd command, the --eigenInput should actually refer to the output directory,
> not the tempDir as the docs suggest, right?
> 
> Also, any recommendations on setting maxError and minEigenValue?  What are the tradeoffs
> I'm making there?  I mean, I suppose maxError is some measure of convergence and minEigenValue
> is just as it sounds, but what are the practical implications of those settings?  Are the
> values in the example good starting points?
> 
> Thanks,
> Grant


