hadoop-common-user mailing list archives

From: Arun C Murthy <ar...@yahoo-inc.com>
Subject: Re: Jar file location
Date: Mon, 07 Jan 2008 20:04:04 GMT
On Mon, Jan 07, 2008 at 11:50:09AM -0800, Lars George wrote:
>Arun,
>
>Ah yes, I see it now in JobClient. OK, then how are the required aux 
>libs handled? I assume a /lib inside the job jar is the only way to go?
>

One option is to use the DistributedCache to distribute your job-specific jars, and the
DistributedCache.add{Archive|File}ToClassPath APIs to add them to the *classpath* of the job.

http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#DistributedCache
http://lucene.apache.org/hadoop/docs/r0.15.1/api/org/apache/hadoop/filecache/DistributedCache.html
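
For example, a rough, untested sketch - the class name and HDFS path below are just
placeholders, and you have to copy the aux jar into HDFS yourself beforehand:

  import java.io.IOException;
  import org.apache.hadoop.filecache.DistributedCache;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.mapred.JobConf;

  public class MyJobSetup {
    public static JobConf configure() throws IOException {
      JobConf conf = new JobConf(MyJobSetup.class);
      // the aux jar must already sit in HDFS; at task-launch time the
      // framework copies it to each task node and puts it on the classpath
      DistributedCache.addFileToClassPath(
          new Path("/user/lars/lib/lucene-core.jar"), conf);
      // addArchiveToClassPath does the same for archives
      return conf;
    }
  }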


I doubt you care, but http://issues.apache.org/jira/browse/HADOOP-1660 helps too - though
that is coming only in 0.16.0.
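
Re. the question further down about launching jobs from your own server code rather
than via *hadoop jar*: that works fine - the jar gets shipped the same way. A rough,
untested sketch of driving the submission through JobClient (class and job names below
are made up):

  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;

  public class SubmitFromCode {
    public static void main(String[] args) throws Exception {
      // JobConf(Class) locates the jar containing the given class, so the
      // framework knows which jar to copy to HDFS and on to the task nodes
      JobConf conf = new JobConf(SubmitFromCode.class);
      conf.setJobName("my-job");
      // ... set mapper, reducer, input/output paths etc. as usual ...
      JobClient.runJob(conf);  // submits the job and blocks until it finishes
    }
  }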

Arun

>I saw the discussion on the Wiki about adding Hbase permanently to the 
>HADOOP_CLASSPATH, but then I also have to deploy the Lucene jar files, 
>Xerces etc. I guess it is better if I add everything non-Hadoop into the 
>job jar's lib directory?
>
>Thanks again for the help,
>Lars
>
>
>Arun C Murthy wrote:
>>On Mon, Jan 07, 2008 at 08:24:36AM -0800, Lars George wrote:
>>  
>>>Hi,
>>>
>>>Maybe someone here can help me with a rather noob question. Where do I 
>>>have to put my custom jar to run it as a map/reduce job? Anywhere and 
>>>then specifying the HADOOP_CLASSPATH variable in hadoop-env.sh?
>>>
>>>    
>>
>>Once you have your jar and submit your job via the *hadoop jar* command,
>>the framework takes care of distributing the software to the nodes on
>>which your maps/reduces are scheduled:
>>$ hadoop jar <custom_jar> <custom_args> 
>>
>>The detail is that the framework copies your jar from the submission node
>>to HDFS and then copies it onto each execution node.
>>
>>Does 
>>http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#Usage 
>>help?
>>
>>Arun
>>
>>  
>>>Also, since I am using the Hadoop API already from our server code, it 
>>>seems natural to launch jobs from within our code. Are there any issues 
>>>with that? I assume I have to copy the jar files first and make them 
>>>available as per my question above, but then I am ready to start it from 
>>>my own code?
>>>
>>>I have read most Wiki entries, and while the actual workings are 
>>>described quite nicely, I could not find an answer to the questions 
>>>above. The demos are already in place and can be started as is, without 
>>>needing to make them available first.
>>>
>>>Again, I apologize for being a noobie.
>>>
>>>Lars
>>>    
>>
>>  

