Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Reply-To: user@hadoop.apache.org
MIME-Version: 1.0
From: Moty Michaely
Date: Fri, 10 Apr 2015 17:53:48 +0300
Subject: Re: Hadoop or spark
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a11c1ef4655d50305135ff01f

--001a11c1ef4655d50305135ff01f
Content-Type: text/plain; charset=UTF-8

Hey,

Xplenty's CTO wrote a good comparison of the two:
https://www.xplenty.com/blog/2014/11/apache-spark-vs-hadoop-mapreduce/?utm_source=hadoop-mailing-group&utm_medium=email&utm_campaign=social

Hope this helps with deciding.

Good luck!

On Fri, Apr 10, 2015 at 4:28 PM, Shahab Yunus <shahab.yunus@gmail.com> wrote:

> Thanks for this. Slides #77 and #87 are pretty good. Quite a lot of it is
> new stuff and still emerging.
>
> Regards,
> Shahab
>
> On Fri, Apr 10, 2015 at 9:10 AM, Peyman Mohajerian <mohajeri@gmail.com>
> wrote:
>
>> There actually is such a discussion, e.g.:
>>
>> http://www.slideshare.net/sbaltagi/spark-or-hadoop-is-it-an-eitheror-proposition-by-slim-baltagi
>>
>> You can have a standalone Spark cluster with no dependency on Hadoop.
>>
>> On Fri, Apr 10, 2015 at 5:47 AM, Shahab Yunus <shahab.yunus@gmail.com>
>> wrote:
>>
>>> I hope I am not misunderstanding your question, but I don't think there
>>> is a comparison between Spark and Hadoop. They are different things.
>>>
>>> Hadoop is a platform on which you can run YARN, HBase and even Spark.
>>> E.g. Cloudera's Hadoop distribution has Spark, HBase, Impala, Pig etc. as
>>> part of its installation. Spark can run within a Hadoop cluster
>>> deployment.
>>>
>>> I think a more apt comparison would be something like whether you should
>>> use regular MapReduce on YARN on Hadoop OR Spark on Hadoop.
>>>
>>> Or, even more direct, Spark vs. Storm, which has been discussed here:
>>> http://marc.info/?l=hadoop-user&m=140434265901449
>>>
>>> Regards,
>>> Shahab
>>>
>>> On Fri, Apr 10, 2015 at 1:08 AM, Ashutosh Kumar <ashutosh.k78@gmail.com>
>>> wrote:
>>>
>>>> How do I decide whether I should go for Hadoop or Spark for a
>>>> greenfield project? I tried to find out, and it looks like Spark can do
>>>> everything that Hadoop can do. I'd appreciate your thoughts on it.
>>>>
>>>> Thanks

--
Moty Michaely
VP R&D, Xplenty

--001a11c1ef4655d50305135ff01f
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11c1ef4655d50305135ff01f--