From: Andrew Ash
Date: Wed, 12 Nov 2014 02:12:37 -0800
Subject: Re: Scala vs Python performance differences
To: user, freeman.jeremy@gmail.com
Cc: "user@spark.incubator.apache.org"

Jeremy,

Did you complete this benchmark in a way that's shareable with those
interested here?

Andrew

On Tue, Apr 15, 2014 at 2:50 PM, Nicholas Chammas <nicholas.chammas@gmail.com> wrote:

> I'd also be interested in seeing such a benchmark.
>
> On Tue, Apr 15, 2014 at 9:25 AM, Ian Ferreira <ianferreira@hotmail.com> wrote:
>
>> This would be super useful. Thanks.
>>
>> On 4/15/14, 1:30 AM, "Jeremy Freeman" <freeman.jeremy@gmail.com> wrote:
>>
>> >Hi Andrew,
>> >
>> >I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on
>> >ML algorithms, as I'm particularly curious about the relative performance of
>> >MLlib in Scala vs the Python MLlib API vs pure Python implementations.
>> >
>> >Will share real results as soon as I have them, but roughly, in our hands,
>> >that 40% number is ballpark correct, at least for some basic operations
>> >(e.g textFile, count, reduce).
>> >
>> >-- Jeremy
>> >
>> >---------------------
>> >Jeremy Freeman, PhD
>> >Neuroscientist
>> >@thefreemanlab
>> >
>> >--
>> >View this message in context:
>> >http://apache-spark-user-list.1001560.n3.nabble.com/Scala-vs-Python-performance-differences-tp4247p4261.html
>> >Sent from the Apache Spark User List mailing list archive at Nabble.com.