spark-dev mailing list archives

From "Ulanov, Alexander" <alexander.ula...@hp.com>
Subject Re: Is breeze thread safe in Spark?
Date Wed, 03 Sep 2014 19:30:54 GMT
What about allocating a new Breeze vector? Can that be unsafe within Spark when it happens
in several threads?

Best regards, Alexander

On 03.09.2014, at 23:17, "Xiangrui Meng" <mengxr@gmail.com> wrote:

> RJ, could you provide a code example that can reproduce the bug you
> observed in local testing? Breeze's += is not thread-safe. But in a
> Spark job, calls to a resultHandler are synchronized:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala#L52
> Let's move our discussion to the JIRA page. -Xiangrui
> 
> On Wed, Sep 3, 2014 at 12:07 PM, RJ Nowling <rnowling@gmail.com> wrote:
>> Here's the JIRA:
>> 
>> https://issues.apache.org/jira/browse/SPARK-3384
>> 
>> Even if the current implementation uses += in a thread-safe manner, it is
>> easy to make the mistake of using += in a parallelized context. I suggest
>> changing all instances of += to +.
>> 
>> I would encourage others to reproduce and validate this issue, though.
>> 
>> 
>> On Wed, Sep 3, 2014 at 3:02 PM, David Hall <dlwh@cs.berkeley.edu> wrote:
>> 
>>> Mutating operations are not thread safe. Operations that don't mutate
>>> should be thread safe. I can't speak to what Evan said, but I would guess
>>> that the way they're using += should be safe.
>>> 
>>> 
>>> On Wed, Sep 3, 2014 at 11:58 AM, RJ Nowling <rnowling@gmail.com> wrote:
>>> 
>>>> David,
>>>> 
>>>> Can you confirm that += is not thread safe but + is?  I'm assuming +
>>>> allocates a new object for the write, while += doesn't.
>>>> 
>>>> Thanks!
>>>> RJ
>>>> 
>>>> 
>>>> On Wed, Sep 3, 2014 at 2:50 PM, David Hall <dlwh@cs.berkeley.edu> wrote:
>>>> 
>>>>> In general, in Breeze we allocate separate work arrays for each call to
>>>>> lapack, so it should be fine. In general concurrent modification isn't
>>>>> thread safe of course, but things that "ought" to be thread safe really
>>>>> should be.
>>>>> 
>>>>> 
>>>>> On Wed, Sep 3, 2014 at 10:41 AM, RJ Nowling <rnowling@gmail.com> wrote:
>>>>> 
>>>>>> No, it's not in all cases. Since Breeze uses lapack under the hood,
>>>>>> changes to memory between different threads are bad.
>>>>>> 
>>>>>> There's actually a potential bug in the KMeans code where it uses +=
>>>>>> instead of +.
>>>>>> 
>>>>>> 
>>>>>> On Wed, Sep 3, 2014 at 1:26 PM, Ulanov, Alexander <alexander.ulanov@hp.com> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Is the Breeze library thread safe when called from Spark MLlib code in
>>>>>>> the case where native libs for BLAS and LAPACK are used? Might it be an
>>>>>>> issue when running Spark locally?
>>>>>>> 
>>>>>>> Best regards, Alexander
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> em rnowling@gmail.com
>>>>>> c 954.496.2314
>>>> 
>>>> 
>> 
>> 


