Mailing-List: contact issues-help@spark.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 5 Nov 2014 23:08:34 +0000 (UTC)
From: "Michelangelo D'Agostino (JIRA)" <jira@apache.org>
To: issues@spark.apache.org
Message-ID: <JIRA.12704480.1387623768000.428831.1415228914533@Atlassian.JIRA>
In-Reply-To: <JIRA.12704480.1387623768000@Atlassian.JIRA>
References: <JIRA.12704480.1387623768000@Atlassian.JIRA>
 <JIRA.12704480.1387623768993@arcas>
Subject: [jira] [Commented] (SPARK-1006) MLlib ALS gets stack overflow with
 too many iterations
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/SPARK-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199314#comment-14199314 ] 

Michelangelo D'Agostino commented on SPARK-1006:
------------------------------------------------

Are there any plans to work on this, or any pointers on how one would go about making the needed modification?  I'm working with a dataset that doesn't appear to converge before it runs into this limitation.

> MLlib ALS gets stack overflow with too many iterations
> ------------------------------------------------------
>
>                 Key: SPARK-1006
>                 URL: https://issues.apache.org/jira/browse/SPARK-1006
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: Matei Zaharia
>
> The tipping point seems to be around 50 iterations. We should fix this by checkpointing the RDDs every 10-20 iterations to break the lineage chain, but checkpointing currently requires HDFS to be installed, which not all users will have.
> We might also be able to fix DAGScheduler to not be recursive.
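
The two fixes proposed in the description can be illustrated with a small toy model. This is only a sketch in plain Python and does not use Spark: the Node class, run(), depth_recursive() and depth_iterative() are hypothetical names made up for illustration, and the "checkpoint" here simply drops the parent link rather than saving anything to stable storage. In Spark itself the corresponding calls would be SparkContext.setCheckpointDir and RDD.checkpoint(); how to wire them into ALS is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy stand-in for an RDD: a name plus a link to the node it was derived from."""
    name: str
    parent: Optional["Node"] = None


def depth_recursive(node: Optional[Node]) -> int:
    """Walk the lineage recursively, using one stack frame per ancestor."""
    if node is None:
        return 0
    return 1 + depth_recursive(node.parent)


def depth_iterative(node: Optional[Node]) -> int:
    """Walk the same lineage with a loop, so the call stack does not grow."""
    depth = 0
    while node is not None:
        depth += 1
        node = node.parent
    return depth


def run(iterations: int, checkpoint_interval: Optional[int] = None) -> Node:
    """Derive one new node per iteration; optionally cut the lineage periodically."""
    current = Node("input")
    for i in range(1, iterations + 1):
        current = Node(f"iter-{i}", parent=current)
        if checkpoint_interval and i % checkpoint_interval == 0:
            # Toy "checkpoint": assume the data is now saved elsewhere,
            # so the link back to earlier iterations can be dropped.
            current.parent = None
    return current


if __name__ == "__main__":
    # 5000 is an arbitrary chain length chosen to exceed Python's default recursion limit.
    long_chain = run(5000)
    try:
        depth_recursive(long_chain)
    except RecursionError:
        print("recursive walk overflowed on the uncheckpointed lineage")

    print("iterative walk depth:", depth_iterative(long_chain))
    print("recursive walk depth with checkpoints:",
          depth_recursive(run(5000, checkpoint_interval=15)))
```

In this model, periodic checkpointing keeps the chain that a recursive walk has to follow no longer than the checkpoint interval plus one, while the iterative walk handles the full chain without checkpoints. Which of the two is more practical for the real scheduler is a separate question that this sketch does not answer.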

