spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: Spark fpg large basket
Date Wed, 11 Mar 2015 07:10:23 GMT
You need to set spark.cores.max to a number, say 16, so that the tasks
get distributed evenly across all 4 machines. Another thing to try, if
you haven't already, is setting spark.default.parallelism.
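
A minimal sketch of what that could look like when building the
SparkConf (the app name and the values 16 and 64 are illustrative
assumptions, not recommendations for your cluster):

import org.apache.spark.{SparkConf, SparkContext}

// Cap the total cores the application may claim across the cluster,
// and raise the default parallelism used for shuffles and by operators
// that don't take an explicit partition count.
val conf = new SparkConf()
  .setAppName("fpg-example")              // hypothetical app name
  .set("spark.cores.max", "16")           // total cores across all workers
  .set("spark.default.parallelism", "64") // illustrative value
val sc = new SparkContext(conf)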

Thanks
Best Regards

On Wed, Mar 11, 2015 at 12:27 PM, Sean Barzilay <sesnbarzilay@gmail.com>
wrote:

> I am running on a 4-worker cluster, each worker having between 16 and 30
> cores and 50 GB of RAM.
>
> On Wed, 11 Mar 2015 8:55 am Akhil Das <akhil@sigmoidanalytics.com> wrote:
>
>> Depending on your cluster setup (cores, memory), you need to specify the
>> parallelism or repartition the data.
>>
>> Thanks
>> Best Regards
>>
>> On Wed, Mar 11, 2015 at 12:18 PM, Sean Barzilay <sesnbarzilay@gmail.com>
>> wrote:
>>
>>> Hi, I am currently using Spark 1.3.0-SNAPSHOT to run the FP-Growth
>>> algorithm from the MLlib library. When I try to run the algorithm over a
>>> large basket (over 1000 items), the program seems to never finish. Did
>>> anyone find a workaround for this problem?
>>>
>>
>>
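
For reference, the quoted question concerns MLlib's FP-Growth. A minimal
sketch of running it with an explicit partition count, assuming the
Spark 1.3 org.apache.spark.mllib.fpm.FPGrowth API (the input path,
minimum support, and partition count are illustrative assumptions):

import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD

// Hypothetical input: one line per basket, items separated by spaces.
// FP-Growth requires the items within a transaction to be unique,
// hence the .distinct.
val transactions: RDD[Array[String]] =
  sc.textFile("hdfs:///data/baskets.txt")  // illustrative path
    .map(_.split(" ").distinct)

val model = new FPGrowth()
  .setMinSupport(0.3)    // illustrative; a very low support on baskets
                         // of 1000+ items makes the itemset space explode
  .setNumPartitions(64)  // spreads the conditional FP-trees across the cluster
  .run(transactions)

println(s"frequent itemsets found: ${model.freqItemsets.count()}")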
