spark-dev mailing list archives

From Reynold Xin <r...@databricks.com>
Subject Re: If you use Spark 1.5 and disabled Tungsten mode ...
Date Tue, 20 Oct 2015 21:27:33 GMT
Jerry - I think that's been fixed in 1.5.1. Do you still see it?

On Tue, Oct 20, 2015 at 2:11 PM, Jerry Lam <chilinglam@gmail.com> wrote:

> I disabled it because of the "Could not acquire 65536 bytes of memory"
> error, which fails the job. So for now, I'm not touching it.
>
> On Tue, Oct 20, 2015 at 4:48 PM, charmee <charmeep@gmail.com> wrote:
>
>> We had disabled Tungsten after we found a few performance issues, but had
>> to re-enable it because we found that with a large number of GROUP BY
>> fields, the shuffle keeps failing when Tungsten is disabled.
>>
>> Here is an excerpt from one of our engineers with his analysis.
>>
>> With Tungsten Enabled (default in Spark 1.5):
>> ~90 files of 0.5G each:
>>
>> Ingest (after applying broadcast lookups) : 54 min
>> Aggregation (~30 fields in group by and another 40 in aggregation) : 18 min
>>
>> With Tungsten Disabled:
>>
>> Ingest : 30 min
>> Aggregation : Erroring out
>>
>> On smaller tests we found that joins are slower with Tungsten enabled. With
>> GROUP BY, the job does not work at all when Tungsten is disabled.
>>
>> Hope this helps.
>>
>> -Charmee
>
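
For anyone comparing the two runs quoted above, here is a rough sketch that turns the reported figures into an approximate input size, ingest throughput, and slowdown ratio. It uses only the numbers from charmee's message, which are approximate ("~90 files of 0.5G each"). The variable and function names are made up for illustration, and the result is back-of-the-envelope arithmetic, not a benchmark.

```python
# Figures as quoted in the thread; times are in minutes.
FILES = 90         # "~90 files" (approximate)
GB_PER_FILE = 0.5  # "0.5G each"

runs = {
    "tungsten_enabled": {"ingest": 54, "aggregation": 18},
    # Aggregation errored out with Tungsten disabled, so there is no timing.
    "tungsten_disabled": {"ingest": 30, "aggregation": None},
}

total_gb = FILES * GB_PER_FILE  # approximate input size in GB


def ingest_throughput(run):
    """Approximate ingest throughput in GB per minute."""
    return total_gb / run["ingest"]


def total_minutes(run):
    """End-to-end time, or None if any stage did not complete."""
    if any(minutes is None for minutes in run.values()):
        return None
    return sum(run.values())


# How much slower ingest was with Tungsten enabled than with it disabled.
ingest_slowdown = runs["tungsten_enabled"]["ingest"] / runs["tungsten_disabled"]["ingest"]

for name, run in runs.items():
    print(name, round(ingest_throughput(run), 2), "GB/min, total:", total_minutes(run))
print("ingest slowdown with Tungsten enabled:", round(ingest_slowdown, 2))
```

Read this way, the report describes a trade-off rather than a clear win: ingest was slower with Tungsten enabled, but only that configuration finished the aggregation.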
