hadoop-hdfs-user mailing list archives

From "Adaryl \"Bob\" Wakefield, MBA" <adaryl.wakefi...@hotmail.com>
Subject Re: Spark vs Tez
Date Tue, 21 Oct 2014 04:16:06 GMT
Using an interpreted scripting language with something that is billing itself as being fast
doesn’t sound like the best idea...

From: Russell Jurney 
Sent: Saturday, October 18, 2014 7:38 AM
To: user@hadoop.apache.org 
Subject: Re: Spark vs Tez

Check out PySpark. No Scala required.

On Friday, October 17, 2014, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com> wrote:

  “The only problem with Spark adoption is the steep learning curve of Scala, and understanding
the API properly.”

  This is why I’m looking for reasons to avoid Spark. In my mind, it’s one more thing
to have to master, and it doesn’t really offer anything that can’t be done with other
tools that are already in my skillset. I spoke with some software engineers recently, and
the discussion basically boiled down to this: if you need to master Java or Scala, go with Java.
Three months into Java, I don’t want to stop that and start learning Scala.

  From: kartik saxena 
  Sent: Friday, October 17, 2014 1:12 PM
  To: user@hadoop.apache.org
  Subject: Re: Spark vs Tez

  I ran a performance benchmark during my summer internship (I am currently a grad student).
I can't reveal much about the specific project, but Spark is still faster than Tez even at around
the 4th or 5th iteration of the same query/dataset. By iteration I mean utilizing the "hot-container"
property of Apache Tez. See the latest release of Tez and some Hortonworks tutorials on their website.

  The only problem with Spark adoption is the steep learning curve of Scala, and understanding
the API properly.


  On Fri, Oct 17, 2014 at 11:06 AM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com> wrote:

    Does anybody have any performance figures on how Spark stacks up against Tez? If you don’t
have figures, does anybody have an opinion? Spark seems so popular but I’m not really seeing

Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
