hadoop-common-user mailing list archives

From Harish Mallipeddi <harish.mallipe...@gmail.com>
Subject Re: how to use hadoop in real life?
Date Fri, 10 Jul 2009 05:32:02 GMT
Hi Shravan,
By Hadoop client, I think he means the "hadoop" command-line program
available under $HADOOP_HOME/bin. You can either write a custom Java program
which directly uses the Hadoop APIs or just write a bash/python script which
will invoke this command-line app and delegate work to it.
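For example, here is a minimal Python sketch of such a wrapper script (the hadoop path, jar name, main class, and HDFS paths below are all hypothetical placeholders — substitute your own):

```python
import subprocess

# hypothetical location of $HADOOP_HOME/bin/hadoop on the client machine
HADOOP = "/usr/local/hadoop/bin/hadoop"

def put_cmd(local_src, hdfs_dst):
    # copy local data into HDFS
    return [HADOOP, "fs", "-put", local_src, hdfs_dst]

def job_cmd(jar, main_class, in_dir, out_dir):
    # launch a MapReduce job packaged in a jar
    return [HADOOP, "jar", jar, main_class, in_dir, out_dir]

def get_cmd(hdfs_src, local_dst):
    # copy job output back out of HDFS
    return [HADOOP, "fs", "-get", hdfs_src, local_dst]

def run(cmd):
    # delegate the actual work to the hadoop command-line app
    subprocess.check_call(cmd)

# Typical usage (assumes the paths above exist):
# run(put_cmd("/tmp/input.log", "/user/shravan/input"))
# run(job_cmd("analytics.jar", "com.example.Analyze",
#             "/user/shravan/input", "/user/shravan/output"))
# run(get_cmd("/user/shravan/output", "/tmp/results"))
```

Since everything goes through the "hadoop" binary, the script needs that client installed and configured (so it knows where the NameNode/JobTracker are), but it does not run any daemons itself.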

- Harish

On Fri, Jul 10, 2009 at 10:41 AM, Shravan Mahankali <
shravan.mahankali@catalytic.com> wrote:

> Hi Alex/ Group,
>
>
>
> Thanks for your response. Is there something called a "Hadoop client"?
> Google does not suggest one!
>
>
>
> Should this Hadoop client be installed and configured as we did with
> Hadoop on the server? And will this Hadoop client occupy memory/disk space
> for running data/name nodes and slaves?
>
>
>
> Thank You,
>
> Shravan Kumar. M
>
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
>
> -----------------------------
>
> This email and any files transmitted with it are confidential and intended
> solely for the use of the individual or entity to whom they are addressed.
> If you have received this email in error please notify the system
> administrator - netopshelpdesk@catalytic.com
>
>  _____
>
> From: Alex Loddengaard [mailto:alex@cloudera.com]
> Sent: Thursday, July 09, 2009 11:19 PM
> To: shravan.mahankali@catalytic.com
> Cc: common-user@hadoop.apache.org
> Subject: Re: how to use hadoop in real life?
>
>
>
> Writing a Java program that uses the API is basically equivalent to
> installing a Hadoop client and writing a Python script to manipulate HDFS
> and fire off a MR job.  It's up to you to decide how much you like Java :).
>
> Alex
>
> On Thu, Jul 9, 2009 at 2:27 AM, Shravan Mahankali
> <shravan.mahankali@catalytic.com> wrote:
>
> Hi Group,
>
> I have data to be analyzed and I would like to dump this data to Hadoop
> from machine.X, whereas Hadoop is running on machine.Y. After dumping this
> data, I would like to initiate a job, get the data analyzed, and get the
> output information back to machine.X.
>
> I would like to do all this programmatically, and am going through the
> Hadoop API for this purpose. I remember Alex saying the other day to
> install Hadoop on machine.X, but I was not sure why to do that.
>
> I was planning to simply write a Java program including the Hadoop-core
> jar, use "FsUrlStreamHandlerFactory" to connect to Hadoop on machine.Y, and
> then use "org.apache.hadoop.fs.shell" to copy data to the Hadoop machine,
> initiate the job, and get the results.
>
> Please advise.
>
> Thank You,
>
> Shravan Kumar. M
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
>
> -----Original Message-----
>
> From: Shravan Mahankali [mailto:shravan.mahankali@catalytic.com]
> Sent: Thursday, July 09, 2009 10:35 AM
> To: common-user@hadoop.apache.org
>
> Cc: 'Alex Loddengaard'
> Subject: RE: how to use hadoop in real life?
>
> Thanks for the information Ted.
>
> Regards,
> Shravan Kumar. M
> Catalytic Software Ltd. [SEI-CMMI Level 5 Company]
>
> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Wednesday, July 08, 2009 10:48 PM
> To: common-user@hadoop.apache.org; shravan.mahankali@catalytic.com
> Cc: Alex Loddengaard
> Subject: Re: how to use hadoop in real life?
>
> In general hadoop is simpler than you might imagine.
>
> Yes, you need to create directories to store data.  This is much lighter
> weight than creating a table in SQL.
>
> But the key question is volume.  Hadoop makes some things easier and Pig
> queries are generally easier to write than SQL (for programmers ... not for
> those raised on SQL), but, overall, map-reduce programs really are more
> work
> to write than SQL queries until you get to really large scale problems.
>
> If your database has less than 10 million rows or so, I would recommend
> that
> you consider doing all analysis in SQL augmented by procedural languages.
> Only as your data goes beyond 100 million to a billion rows do the clear
> advantages of map-reduce formulation become apparent.
>
> On Tue, Jul 7, 2009 at 11:35 PM, Shravan Mahankali <
> shravan.mahankali@catalytic.com> wrote:
>
> > Use Case: We have a web app where users perform some actions; we have to
> > track these actions and various parameters related to the action
> > initiator, and we currently store this information in the database. But
> > our manager has suggested evaluating Hadoop for this scenario. However, I
> > am not clear whether every time I run a job in Hadoop I have to create a
> > directory, and how I can track that later to read the data analyzed by
> > Hadoop. Even though I drop the user action information in Hadoop, I have
> > to put this information in our database so that it knows the trend and
> > responds to various requests accordingly.
> >
>
>
>
>


-- 
Harish Mallipeddi
http://blog.poundbang.in
