From: "Vjeran Marcinko" <vjeran.marcinko@email.t-com.hr>
To: user@hadoop.apache.org
Subject: Best Hadoop dev environment [WAS: RE: Few noob MR questions]
Date: Sun, 14 Apr 2013 07:18:35 +0200
Hi again,

You actually touched on what I'm trying to do here - setting up the best Hadoop development environment.

Moreover, don't ask me why, but my development machine runs Windows, so I don't have Hadoop on it; instead I run Hadoop inside a Linux virtual machine. I would like to develop my job code in my favourite IDE, deploy my jobs from there, and watch them run on this "remote" virtual Hadoop platform. Build scripts can help a lot: each time I change some job code, the scripts could package it and transfer it to the Hadoop machine, where I can deploy it via the "hadoop jar ..." command. I will certainly do that *in production*, but *in development* I would like to avoid it. Besides, when I say "Run" in the IDE, it uses "java -classpath ...", not even "java -jar ...", so the job class is not available in any packaged form (at least by default - any proper IDE can add extra build steps).

So, are there any more hints for setting up this environment?

Hadoop can really be intimidating for a newbie - there are so many versions out there, so many examples using different APIs, and so many ways to deploy a job, that I don't know where to start. And my Windows OS brings even more problems in the beginning, when I don't know much.
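[One common way to run jobs from an IDE against a remote VM is to put client-side configuration on the IDE's classpath, so the Hadoop client in your project talks to the VM instead of starting a local runner. A minimal sketch for a 2013-era (Hadoop 1.x) setup follows; the hostname "hadoop-vm" and both ports are placeholders for whatever your VM actually exposes:]

```
<!-- core-site.xml, placed on the IDE's classpath.
     "hadoop-vm" and the ports are placeholders for your VM. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-vm:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml, also on the IDE's classpath. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-vm:9001</value>
  </property>
</configuration>
```

[With these in place, a plain "Run" from the IDE submits to the VM's JobTracker rather than running locally; note the job classes still need to reach the cluster, e.g. via job.setJarByClass() on a built jar.]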
Regards,
Vjeran

From: Bjorn Jonsson [mailto:bjornjon@gmail.com]
Sent: Sunday, April 14, 2013 5:27 AM
To: user@hadoop.apache.org
Subject: Re: Few noob MR questions

Correct, you can use java -jar to submit a job, with the "driver" code in a plain static main method. I do it all the time. You can of course also run a Job straight from your IDE Java code. You can check out the RunJar class in the Hadoop API Javadoc to see, essentially, what the hadoop command does, I think.

Cheers,
Bj

On Sat, Apr 13, 2013 at 3:59 PM, Jens Scheidtmann <jens.scheidtmann@gmail.com> wrote:

Dear Vjeran,

your own jobs should implement the Tool interface and use ToolRunner. This gives you additional standard options on the command line.

Also have a look at the class ProgramDriver as used here: https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/ExampleDriver.java which further simplifies executing your MR jobs.

Best regards,
Jens
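[The Tool/ToolRunner pattern Jens mentions can be sketched roughly as below. The class and job names are made up for illustration, and the mapper/reducer setters are left out so the sketch stands alone (the new-API defaults are identity classes); in a real job you would call job.setMapperClass()/setReducerClass() with your own classes. The API shown is the org.apache.hadoop.mapreduce one common in Hadoop 1.x:]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver class, for illustration only.
public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any generic options that
        // ToolRunner parsed, e.g. -D key=value, -fs, or -jt.
        Job job = new Job(getConf(), "my-job");
        job.setJarByClass(MyJobDriver.class);
        // setMapperClass()/setReducerClass() would go here.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before run() sees args.
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
    }
}
```

[Because ToolRunner uses GenericOptionsParser, the same driver accepts overrides such as -fs and -jt on the command line, which is handy for pointing a run at the VM without touching code.]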