spark-dev mailing list archives

From Nick Pentreath <nick.pentre...@gmail.com>
Subject Re: Running ALS on comparatively large RDD
Date Fri, 11 Mar 2016 07:16:24 GMT
Could you provide more details about:
1. Data set size (# ratings, # users and # products)
2. Spark cluster set up and version

Thanks

On Fri, 11 Mar 2016 at 05:53 Deepak Gopalakrishnan <dgkris@gmail.com> wrote:

> Hello All,
>
> I've been running Spark's ALS on a dataset of users and rated items. I
> first encode my users as integers using an auto-increment function
> (just like zipWithIndex), and I do the same for my items. I then create
> an RDD of the ratings and feed it to ALS.
>
> My issue is that the ALS algorithm never completes. Attached is a
> screenshot of the stages window.
>
> Any help will be greatly appreciated.
>
> --
> Regards,
> *Deepak Gopalakrishnan*
> *Mobile*:+918891509774
> *Skype* : deepakgk87
> http://myexps.blogspot.com
>
>
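For readers following the thread, the encoding step described in the quoted message, together with the counts requested in the reply, can be sketched roughly as below. This is a plain-Python illustration only, not the poster's code: the sample data and the names `raw_ratings`, `build_index`, `encoded`, and `summary` are all hypothetical, and no Spark is involved. On an actual RDD, `zipWithIndex` plays the role of `build_index`, with the caveat that it yields Long indices while MLlib's `Rating` expects Int user and product IDs.

```python
# Hypothetical sample data: (user_key, item_key, rating) with string keys.
raw_ratings = [
    ("alice", "book-1", 4.0),
    ("bob", "book-2", 3.0),
    ("alice", "book-2", 5.0),
]


def build_index(keys):
    """Assign each distinct key a contiguous integer ID in first-seen order."""
    index = {}
    for key in keys:
        if key not in index:
            index[key] = len(index)
    return index


# One index for users and a separate one for items, as in the quoted message.
user_index = build_index(u for u, _, _ in raw_ratings)
item_index = build_index(i for _, i, _ in raw_ratings)

# Integer-encoded (user, item, rating) triples, the shape ALS consumes.
encoded = [(user_index[u], item_index[i], r) for u, i, r in raw_ratings]

# The data set size details asked for in the reply.
summary = {
    "ratings": len(encoded),
    "users": len(user_index),
    "items": len(item_index),
}
```

Reporting the three numbers in `summary` alongside the cluster configuration and Spark version would cover both points raised in the reply.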
