mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: PCA with ssvd leads to StackOverFlowError
Date Tue, 04 Mar 2014 17:44:47 GMT
As for the stack trace, it doesn't seem to match the current trunk.
Again, I need to know which version you are running.

But looking at the current trunk, I don't really see how that could be
happening at the moment.
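
For readers following the thread: the same frame repeating at MatrixColumnMeansJob.java:55 in the quoted trace is the usual signature of a method that keeps calling itself. The sketch below is only an illustration of that general pattern, assuming a convenience overload that accidentally delegates to itself. The class and method names are hypothetical and this is not Mahout's actual code, nor a confirmed diagnosis of the report.

```java
// Hypothetical sketch: names are illustrative and NOT taken from Mahout's source.
public class SelfRecursionSketch {

    // Meant to delegate to the three-argument overload, but the call below
    // has two String arguments and therefore resolves back to this method.
    public static double[] run(String input, String output) {
        return run(input, output); // bug: same frame is pushed until the stack overflows
    }

    // Corrected delegation: forwards to the overload that does the work.
    public static double[] runFixed(String input, String output) {
        return run(input, output, 1);
    }

    // Stand-in for the method that would perform the real computation.
    public static double[] run(String input, String output, int numReducers) {
        return new double[] {0.0};
    }

    // Returns true when calling the buggy overload ends in a StackOverflowError.
    public static boolean overflows() {
        try {
            run("in", "out");
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy overload overflows: " + overflows());
        System.out.println("fixed overload result length: " + runFixed("in", "out").length);
    }
}
```

A trace produced by the buggy overload shows one method and one line number repeated, which is why knowing the exact version (and therefore the exact source at that line) matters for tracking it down.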


On Tue, Mar 4, 2014 at 9:40 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> It doesn't look like -us has been removed. At least I see it on the head
> of the trunk, SSVDCli.java, line 62:
>
>     addOption("uSigma", "us", "Compute U * Sigma", String.valueOf(false));
>
> i.e. the short version (single dash) -us true, or the long version
> (double dash) --uSigma true. Can you check again with 0.9? Thanks.
>
>
> On Tue, Mar 4, 2014 at 9:37 AM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>
>> Kevin, thanks for reporting this.
>>
>> A stack overflow error has not been known to happen to date, but I will
>> take a look. Given your stack trace, it looks like a bug in the mean
>> computation code, although it may have been induced by some circumstances
>> specific to your deployment.
>>
>> What version is it? 0.9?
>>
>> As for -us, as far as I know it has not been removed. If it was, it
>> happened without my knowledge. I will take a look at the trunk.
>>
>> -d
>>
>>
>> On Tue, Mar 4, 2014 at 5:53 AM, Kevin Moulart <kevinmoulart@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to apply a PCA to reduce the dimension of a matrix of 1603
>>> columns and 100,000 to 30,000,000 rows using ssvd with the pca option,
>>> and I always get a StackOverflowError:
>>>
>>> Here is my command line:
>>> mahout ssvd -i /user/myUser/Echant100k -o /user/myUser/Echant/SVD100 -k 100 -pca "true" -U "false" -V "false" -t 3 -ow
>>>
>>> I also tried to put "-us true" as mentioned in
>>>
>>> https://cwiki.apache.org/confluence/download/attachments/27832158/SSVD-CLI.pdf?version=18&modificationDate=1381347063000&api=v2
>>> but the option is not available anymore.
>>>
>>> The output of the previous command is:
>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>> Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop
>>> and HADOOP_CONF_DIR=/etc/hadoop/conf
>>> MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
>>> 14/03/04 14:45:16 INFO common.AbstractJob: Command line arguments:
>>> {--abtBlockHeight=[200000], --blockHeight=[10000], --broadcast=[true],
>>> --computeU=[false], --computeV=[false], --endPhase=[2147483647],
>>> --input=[/user/myUser/Echant100k], --minSplitSize=[-1],
>>> --outerProdBlockHeight=[30000], --output=[/user/myUser/Echant/SVD100],
>>> --oversampling=[15], --overwrite=null, --pca=[true], --powerIter=[0],
>>> --rank=[100], --reduceTasks=[3], --startPhase=[0], --tempDir=[temp],
>>> --uHalfSigma=[false], --vHalfSigma=[false]}
>>> Exception in thread "main" java.lang.StackOverflowError
>>>     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
>>>     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
>>>     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
>>>     ...
>>>
>>> I searched online and didn't find a solution to my problem.
>>>
>>> Can you help me?
>>>
>>> Thanks in advance,
>>>
>>> --
>>> Kévin Moulart
>>>
>>
>>
>
