hadoop-user mailing list archives

From Ahmed Ossama <ah...@aossama.com>
Subject Re: How to run a mapreduce program not on the node of hadoop cluster?
Date Wed, 14 Jan 2015 15:05:28 GMT
The node that the project will be deployed on should have the same
configuration as the cluster, as well as the Hadoop executables. It
doesn't have to be one of the cluster nodes.

This node is typically called a gateway or edge node. It has all the
client programs (Hadoop executables, Pig, etc.), and you use it to
submit jobs to the cluster.

The executables use the configuration to find out where to submit jobs,
where your HDFS NameNode is located, and so on.
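
To illustrate the point above, the client-side configuration on a gateway node might contain entries like the following. This is only a sketch, assuming a Hadoop 2.x cluster running YARN (the thread does not say which version is in use). The hostnames and the port are placeholders, and in practice you would copy the cluster's own configuration files rather than write them by hand.

```xml
<!-- core-site.xml: where HDFS lives (placeholder host and port) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

<!-- mapred-site.xml: submit to YARN instead of running in local mode -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: where jobs are submitted (placeholder host) -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager.example.com</value>
  </property>
</configuration>
```

With files like these in the client's configuration directory, a job submitted from the gateway with `hadoop jar` should go to the cluster instead of running locally.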

On 01/14/2015 04:39 PM, Cao Yi wrote:
> The program will be used in a production environment.
> Do you mean that the program must be deployed on a node of the
> cluster?
> I have some experience operating databases: I can
> query/edit/add/remove data on the OS on which the database is
> installed, or operate on it remotely from another machine. Can I use
> Hadoop remotely in a similar way?
> Best Regards,
> Iridium
> On Wed, Jan 14, 2015 at 9:15 PM, unmesha sreeveni 
> <unmeshabiju@gmail.com <mailto:unmeshabiju@gmail.com>> wrote:
>     Your data won't get split, so your program runs as a single mapper
>     and a single reducer, and your intermediate data is not shuffled
>     and sorted. But you can use this for debugging.
>     On Jan 14, 2015 2:04 PM, "Cao Yi" <iridiumcao@gmail.com
>     <mailto:iridiumcao@gmail.com>> wrote:
>         Hi,
>         I wrote some MapReduce code in my project /my_prj/. /my_prj/
>         will be deployed on a machine which is not a node of the
>         cluster.
>         How can /my_prj/ run a MapReduce job in this case?
>         Thank you!
>         Best Regards,
>         Iridium

Ahmed Ossama
