cassandra-user mailing list archives

From Dennis Lovely <...@aegisco.com>
Subject Re: Spark Memory Error - Not enough space to cache broadcast
Date Thu, 16 Jun 2016 21:25:51 GMT
I believe you want to set memoryFraction higher, not lower.  These two
older threads seem to describe issues similar to what you are experiencing:

https://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3CCAHUQ+_ZqaWFs_MJ=+V49bD2paKvjLErPKMEW5duLO1jAo4=d1A@mail.gmail.com%3E
https://www.mail-archive.com/user@spark.apache.org/msg44793.html

More info on tuning shuffle behavior:
https://spark.apache.org/docs/1.5.1/configuration.html#shuffle-behavior
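
For what it's worth, here is a minimal sketch of bumping those settings on
a 1.5.x submit. The 0.6/0.3 values and the jar name are placeholders to
tune for your own job, not something verified against your workload, and
the two fractions plus headroom for user code should stay well under 1.0:

    # Spark 1.5.x legacy memory settings (deprecated in 1.6).
    # spark.storage.memoryFraction -> heap share for cached/broadcast blocks
    # spark.shuffle.memoryFraction -> heap share for shuffle aggregation buffers
    spark-submit \
      --conf spark.storage.memoryFraction=0.6 \
      --conf spark.shuffle.memoryFraction=0.3 \
      your-streaming-job.jar   # placeholder for the actual application jar

The same keys can also be set on the SparkConf before the StreamingContext
is created. A matching sketch for the 1.6 unified settings is at the bottom
of the quoted thread below.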

On Thu, Jun 16, 2016 at 1:57 PM, Cassa L <lcassa8@gmail.com> wrote:

> Hi Dennis,
>
> On Wed, Jun 15, 2016 at 11:39 PM, Dennis Lovely <dl@aegisco.com> wrote:
>
>> You could try tuning spark.shuffle.memoryFraction and
>> spark.storage.memoryFraction (both of which have been deprecated in 1.6),
>> but ultimately you need to find out where you are bottlenecked and address
>> that, as adjusting memoryFraction is only a stopgap.  By default,
>> spark.shuffle.memoryFraction is 0.2 and spark.storage.memoryFraction is 0.6.
>>
> I have set the above parameters to 0.5. Do they need to be increased?
>
> Thanks.
>
>> On Wed, Jun 15, 2016 at 9:37 PM, Cassa L <lcassa8@gmail.com> wrote:
>>
>>> Hi,
>>> I did set --driver-memory 4G. I still run into this issue after an
>>> hour of data load.
>>>
>>> I also tried version 1.6 in a test environment. I hit this issue much
>>> faster there than in the 1.5.1 setup.
>>> LCassa
>>>
>>> On Tue, Jun 14, 2016 at 3:57 PM, Gaurav Bhatnagar <gauravkb@gmail.com>
>>> wrote:
>>>
>>>> try setting the option --driver-memory 4G
>>>>
>>>> On Tue, Jun 14, 2016 at 3:52 PM, Ben Slater <ben.slater@instaclustr.com> wrote:
>>>>
>>>>> A high-level shot in the dark, but in our testing we found Spark 1.6 a
>>>>> lot more reliable in low-memory situations (presumably due to
>>>>> https://issues.apache.org/jira/browse/SPARK-10000). If it’s an
>>>>> option, probably worth a try.
>>>>>
>>>>> Cheers
>>>>> Ben
>>>>>
>>>>> On Wed, 15 Jun 2016 at 08:48 Cassa L <lcassa8@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I would appreciate any clue on this. It has become a bottleneck for
>>>>>> our Spark job.
>>>>>>
>>>>>> On Mon, Jun 13, 2016 at 2:56 PM, Cassa L <lcassa8@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm using Spark version 1.5.1. I am reading data from Kafka into
>>>>>>> Spark and writing it into Cassandra after processing it. The job starts
>>>>>>> fine and runs well for some time, until I start getting the errors below.
>>>>>>> Once these errors appear, the job starts to lag behind and I see
>>>>>>> scheduling and processing delays in the streaming UI.
>>>>>>>
>>>>>>> Worker memory is 6GB and executor memory is 5GB. I also tried to
>>>>>>> tweak the memoryFraction parameters. Nothing works.
>>>>>>>
>>>>>>>
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: ensureFreeSpace(4044) called with curMem=565394, maxMem=2778495713
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Block broadcast_69652_piece0 stored as bytes in memory (estimated size 3.9 KB, free 2.6 GB)
>>>>>>> 16/06/13 21:26:02 INFO TorrentBroadcast: Reading broadcast variable 69652 took 2 ms
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Failed to reserve initial memory threshold of 1024.0 KB for computing block broadcast_69652 in memory.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Not enough space to cache broadcast_69652 in memory! (computed 496.0 B so far)
>>>>>>> 16/06/13 21:26:02 INFO MemoryStore: Memory use = 556.1 KB (blocks) + 2.6 GB (scratch space shared across 0 tasks(s)) = 2.6 GB. Storage limit = 2.6 GB.
>>>>>>> 16/06/13 21:26:02 WARN MemoryStore: Persisting block broadcast_69652 to disk instead.
>>>>>>> 16/06/13 21:26:02 INFO BlockManager: Found block rdd_100761_1 locally
>>>>>>> 16/06/13 21:26:02 INFO Executor: Finished task 0.0 in stage 71577.0 (TID 452316). 2043 bytes result sent to driver
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> L
>>>>>>>
>>>>>>>
>>>>>> --
>>>>> ————————
>>>>> Ben Slater
>>>>> Chief Product Officer
>>>>> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
>>>>> +61 437 929 798
>>>>>
>>>>
>>>>
>>>
>>
>
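
And in case it helps anyone finding this thread later: if you do move to
1.6 as Ben suggests, the legacy fractions above go away in favour of the
unified memory manager from SPARK-10000, so a submit that also picks up
Gaurav's --driver-memory suggestion would look more like the sketch below.
The memory sizes are illustrative placeholders, and 0.75/0.5 are simply
the 1.6 defaults rather than tuned recommendations:

    # Spark 1.6+: unified execution/storage pool (SPARK-10000).
    # spark.memory.fraction        -> share of heap for execution + storage (default 0.75 in 1.6)
    # spark.memory.storageFraction -> portion of that pool protected from eviction (default 0.5)
    spark-submit \
      --driver-memory 4G \
      --executor-memory 5G \
      --conf spark.memory.fraction=0.75 \
      --conf spark.memory.storageFraction=0.5 \
      your-streaming-job.jar   # placeholder for the actual application jar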
