hadoop-mapreduce-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: Spark vs Tez
Date Sat, 18 Oct 2014 08:22:05 GMT
It is my understanding that one of the big differences between Tez and
Spark is that a Tez-based query still has the startup overhead of
starting JVMs on the YARN cluster. Spark-based queries are executed
immediately on "already running JVMs".

So for interactive dashboards Spark seems more suitable.

Did I understand correctly?

Niels Basjes
On Oct 17, 2014 8:30 PM, "Gavin Yue" <yue.yuanyuan@gmail.com> wrote:

> Spark and Tez both make MR faster; of this there is no doubt.
>
> They also provide new features like DAGs, which are quite important for
> interactive query processing. From this perspective, you could view them
> as wrappers around MR that try to handle the intermediate buffers (files)
> more efficiently, which is a big pain point in MR.
>
> They also both try to use memory as the buffer instead of only the
> filesystem. Spark has the concept of an RDD, which is quite interesting but
> also limited.
>
>
>
> On Fri, Oct 17, 2014 at 11:23 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   It was my understanding that Spark is for faster batch processing. Tez is
>> the new execution engine that replaces MapReduce and is also supposed to
>> speed up batch processing. Is that not correct?
>> B.
>>
>>
>>
>>  *From:* Shahab Yunus <shahab.yunus@gmail.com>
>> *Sent:* Friday, October 17, 2014 1:12 PM
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: Spark vs Tez
>>
>>  What aspects of Tez and Spark are you comparing? They have different
>> purposes and thus are not directly comparable, as far as I understand.
>>
>> Regards,
>> Shahab
>>
>> On Fri, Oct 17, 2014 at 2:06 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   Does anybody have any performance figures on how Spark stacks up
>>> against Tez? If you don’t have figures, does anybody have an opinion? Spark
>>> seems so popular but I’m not really seeing why.
>>> B.
>>>
>>
>>
>
>
