Subject: Re: Hadoop & Python
From: Alex Loddengaard
To: core-user@hadoop.apache.org
Date: Tue, 19 May 2009 17:17:16 -0700

You might also check out Dumbo, which is a Hadoop Python module.

Alex

On Tue, May 19, 2009 at 10:35 AM, s d wrote:

> Thanks.
> So in the overall scheme of things, what is the general feeling about using
> python for this? I like the ease of deploying and reading python compared
> with Java but want to make sure using python over hadoop is scalable & is
> standard practice and not something done only for prototyping and small
> scale tests.
>
> On Tue, May 19, 2009 at 9:48 AM, Alex Loddengaard wrote:
>
> > Streaming is slightly slower than native Java jobs. Otherwise Python works
> > great in streaming.
> >
> > Alex
> >
> > On Tue, May 19, 2009 at 8:36 AM, s d wrote:
> >
> > > Hi,
> > > How robust is using hadoop with python over the streaming protocol? Any
> > > disadvantages (performance? flexibility?)? It just strikes me that python
> > > is so much more convenient when it comes to deploying and crunching text
> > > files.
> > > Thanks,
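
For readers unfamiliar with the streaming approach discussed in this thread, here is a minimal word-count sketch in Python. It is not from the original messages. It assumes Hadoop Streaming's default text convention: each record is one line on stdin/stdout, a tab separates key from value, and the reducer receives its input sorted by key. The function names and the `map`/`reduce` command-line argument are hypothetical, chosen only for illustration.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one "word<TAB>1" record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key.

    Assumes the input lines are already sorted by key, which is what the
    framework's sort phase (or a local `sort`) provides.
    """
    # Skip any line that lacks a tab so malformed input does not crash the job.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical convention: run as "script.py map" or "script.py reduce".
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Because both steps only read stdin and write stdout, the same script can be checked locally with an ordinary shell pipeline (mapper, then `sort`, then reducer) before it is submitted as a streaming job.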