hadoop-hdfs-user mailing list archives

From Gavin Yue <yue.yuany...@gmail.com>
Subject Re: Spark vs Tez
Date Fri, 17 Oct 2014 18:29:58 GMT
Spark and Tez both make MR faster; there is no doubt about that.

They also provide new features like DAG-based execution, which is quite
important for interactive query processing.  From this perspective, you
could view them as wrappers around MR that try to handle the intermediate
buffers (files) more efficiently.  Those are a big pain point in MR.

They also both try to use memory as the buffer instead of relying only on
the filesystem.  Spark has the concept of an RDD (Resilient Distributed
Dataset), which is quite interesting but also limited.
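
To make the intermediate-file point concrete, here is a rough sketch in plain
Python. It uses neither the Spark nor the Tez API; the function names and the
sample input are made up for illustration. The first function mimics two
chained MR-style jobs that hand data over through a file, the second runs the
same two steps as one pipeline that keeps the intermediate result in memory.

```python
import json
import os
import tempfile
from collections import Counter

LINES = ["spark tez spark", "tez mr", "spark"]  # made-up sample input


def chained_jobs_via_files(lines, workdir):
    """Two MR-style jobs: job 1 writes a file that job 2 reads back."""
    intermediate = os.path.join(workdir, "job1_output.json")

    # Job 1: count words, then persist the result as the intermediate file.
    counts = Counter(word for line in lines for word in line.split())
    with open(intermediate, "w") as f:
        json.dump(counts, f)

    # Job 2: reload the intermediate file and keep words seen more than once.
    with open(intermediate) as f:
        reloaded = json.load(f)
    return sorted(w for w, n in reloaded.items() if n > 1)


def single_pipeline_in_memory(lines):
    """Same two steps as one pipeline; the counts never leave memory."""
    counts = Counter(word for line in lines for word in line.split())
    return sorted(w for w, n in counts.items() if n > 1)


with tempfile.TemporaryDirectory() as workdir:
    via_files = chained_jobs_via_files(LINES, workdir)
in_memory = single_pipeline_in_memory(LINES)
```

Both versions give the same answer; the difference is only the extra write
and read between the steps, which is the cost a DAG engine tries to avoid
when it can.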



On Fri, Oct 17, 2014 at 11:23 AM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   It was my understanding that Spark is faster batch processing. Tez is
> the new execution engine that replaces MapReduce and is also supposed to
> speed up batch processing. Is that not correct?
> B.
>
>
>
>  *From:* Shahab Yunus <shahab.yunus@gmail.com>
> *Sent:* Friday, October 17, 2014 1:12 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Spark vs Tez
>
>  What aspects of Tez and Spark are you comparing? They have different
> purposes and thus not directly comparable, as far as I understand.
>
> Regards,
> Shahab
>
> On Fri, Oct 17, 2014 at 2:06 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   Does anybody have any performance figures on how Spark stacks up
>> against Tez? If you don’t have figures, does anybody have an opinion? Spark
>> seems so popular but I’m not really seeing why.
>> B.
>>
>
>
