Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
From: Harish Mallipeddi
Date: Fri, 10 Jul 2009 11:02:02 +0530
Subject: Re: how to use hadoop in real life?
To: common-user@hadoop.apache.org

Hi Shravan,

By "Hadoop client", I think he means the "hadoop" command-line program available under $HADOOP_HOME/bin. You can either write a custom Java program that uses the Hadoop APIs directly, or write a bash/Python script that invokes this command-line app and delegates the work to it.

- Harish

On Fri, Jul 10, 2009 at 10:41 AM, Shravan Mahankali <shravan.mahankali@catalytic.com> wrote:

> Hi Alex/ Group,
>
> Thanks for your response. Is there something called a "Hadoop client"? Google does not suggest one!
>
> Should this Hadoop client be installed and configured the way we installed Hadoop on a server? If so, will this Hadoop client occupy memory/disk space for running data/name nodes and slaves?
>
> Thank You,
>
> Shravan Kumar. M
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
> -----------------------------
> This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed.
> If you have received this email in error please notify the system administrator - netopshelpdesk@catalytic.com
>
> _____
>
> From: Alex Loddengaard [mailto:alex@cloudera.com]
> Sent: Thursday, July 09, 2009 11:19 PM
> To: shravan.mahankali@catalytic.com
> Cc: common-user@hadoop.apache.org
> Subject: Re: how to use hadoop in real life?
>
> Writing a Java program that uses the API is basically equivalent to installing a Hadoop client and writing a Python script to manipulate HDFS and fire off a MR job. It's up to you to decide how much you like Java :).
>
> Alex
>
> On Thu, Jul 9, 2009 at 2:27 AM, Shravan Mahankali <shravan.mahankali@catalytic.com> wrote:
>
> Hi Group,
>
> I have data to be analyzed, and I would like to dump this data to Hadoop from machine.X, whereas Hadoop is running on machine.Y. After dumping this data, I would like to initiate a job, get the data analyzed, and get the output back to machine.X.
>
> I would like to do all this programmatically, and am going through the Hadoop API for this same purpose. I remember that last time Alex said to install Hadoop on machine.X, but I was not sure why.
>
> If I simply write a Java program including the hadoop-core jar, I was planning to use "FsUrlStreamHandlerFactory" to connect to Hadoop on machine.Y and then use "org.apache.hadoop.fs.shell" to copy data to the Hadoop machine, initiate the job, and get the results.
>
> Please advise.
>
> Thank You,
>
> Shravan Kumar. M
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
> -----Original Message-----
> From: Shravan Mahankali [mailto:shravan.mahankali@catalytic.com]
> Sent: Thursday, July 09, 2009 10:35 AM
> To: common-user@hadoop.apache.org
> Cc: 'Alex Loddengaard'
> Subject: RE: how to use hadoop in real life?
>
> Thanks for the information, Ted.
>
> Regards,
> Shravan Kumar. M
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
>
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Wednesday, July 08, 2009 10:48 PM
> To: common-user@hadoop.apache.org; shravan.mahankali@catalytic.com
> Cc: Alex Loddengaard
> Subject: Re: how to use hadoop in real life?
>
> In general, Hadoop is simpler than you might imagine.
>
> Yes, you need to create directories to store data. This is much lighter weight than creating a table in SQL.
>
> But the key question is volume. Hadoop makes some things easier, and Pig queries are generally easier to write than SQL (for programmers... not for those raised on SQL), but overall, map-reduce programs really are more work to write than SQL queries until you get to really large-scale problems.
>
> If your database has fewer than 10 million rows or so, I would recommend that you consider doing all analysis in SQL augmented by procedural languages. Only as your data grows beyond 100 million to a billion rows do the clear advantages of the map-reduce formulation become apparent.
> On Tue, Jul 7, 2009 at 11:35 PM, Shravan Mahankali <shravan.mahankali@catalytic.com> wrote:
>
> > Use Case: We have a web app where a user performs some actions. We have to track these actions and various parameters related to the action initiator, and we currently store this information in the database. But our manager has suggested evaluating Hadoop for this scenario. However, I am not clear whether every time I run a job in Hadoop I have to create a directory, or how I can track that later to read the data analyzed by Hadoop. Even if I drop user-action information into Hadoop, I still have to put this information into our database so that it knows the trend and responds to various requests accordingly.

-- 
Harish Mallipeddi
http://blog.poundbang.in
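P.S. The script route mentioned at the top of this thread (wrapping the "hadoop" binary under $HADOOP_HOME/bin) can be very little code. Here is a minimal Python sketch; the install path and HDFS paths are my own assumptions for illustration, not something from the thread, so adapt them to your cluster:

```python
import subprocess

# Assumed location of the Hadoop CLI on the client machine; adjust to your install.
HADOOP = "/usr/local/hadoop/bin/hadoop"

def fs_command(*args):
    """Build an 'hadoop fs ...' command line without running it."""
    return [HADOOP, "fs"] + list(args)

def run_fs(*args):
    """Run the command; raises CalledProcessError if hadoop exits non-zero."""
    subprocess.check_call(fs_command(*args))

# Example usage (needs a reachable cluster, so left commented out):
# run_fs("-put", "/tmp/events.log", "input/")
# run_fs("-ls", "input/")
```

Any language that can spawn a process works the same way; the client machine only needs the Hadoop distribution unpacked and pointed at the cluster, not running daemons.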
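P.P.S. Shravan's full round trip from machine.X (push input to HDFS, run the job on machine.Y's cluster, pull the output back) can be scripted the same way. Note that, as far as I know, FsUrlStreamHandlerFactory only lets Java code *read* HDFS files via java.net.URL, so writing data in needs the FileSystem API or the CLI. In the sketch below, the jar name, job class, and all paths are invented placeholders:

```python
import subprocess

HADOOP = "/usr/local/hadoop/bin/hadoop"  # assumed client install path on machine.X

def plan(local_in, hdfs_in, hdfs_out, local_out):
    """Return the three hadoop CLI calls for one analysis round trip.

    'my-analysis.jar' and 'com.example.AnalysisJob' are placeholders for
    whatever MapReduce job you actually submit.
    """
    return [
        [HADOOP, "fs", "-put", local_in, hdfs_in],
        [HADOOP, "jar", "my-analysis.jar", "com.example.AnalysisJob",
         hdfs_in, hdfs_out],
        [HADOOP, "fs", "-get", hdfs_out, local_out],
    ]

def run_pipeline(local_in, hdfs_in, hdfs_out, local_out):
    """Execute each step in order, stopping on the first failure."""
    for cmd in plan(local_in, hdfs_in, hdfs_out, local_out):
        subprocess.check_call(cmd)

# run_pipeline("/data/actions.log", "input/actions", "output/run1", "/data/out")
```

Keeping the command construction (plan) separate from execution makes the script easy to dry-run and log before pointing it at a live cluster.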