hadoop-general mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: Large startup time in remote MapReduce job
Date Tue, 21 Jun 2011 20:58:38 GMT

On Jun 21, 2011, at 1:31 PM, Harsh J wrote:

> Gabor,
> If your jar does not contain code changes that need to get transmitted
> every time, you can consider placing them on the JT/TT classpaths

	... which means you get to bounce your system every time you change code.

> and
> not do any jar registration in your job submission code. You'll see a
> related WARN but it should be OK to ignore that.
> If not, work on other ways to get your jar size reduced. Does it
> really contain 20 MB worth of user code or is that with libraries?

	Harsh is on the right track.

	Break your jar up into multiple chunks and put the fairly static pieces into the
distributed cache.  See
http://wiki.apache.org/hadoop/FAQ#How_do_I_submit_extra_content_.28jars.2C_static_files.2C_etc.29_for_my_job_to_use_during_runtime.3F
for more info.
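
	A rough sketch of that split, assuming your own code lives in a small job jar and
the libraries are passed separately with -libjars.  Every jar, class, and path name
below is made up.  -libjars is only honored if the driver parses the generic options
(for example by running through ToolRunner).  The script prints the command instead
of running it, so you can check it first.

```shell
#!/bin/sh
# Hypothetical names throughout; substitute your own.
JOB_JAR="myjob-core.jar"                           # small jar with only your own code
DRIVER="com.example.MyDriver"                      # assumed to run through ToolRunner
STATIC_JARS="lib/big-dep-a.jar lib/big-dep-b.jar"  # rarely-changing library jars

# -libjars takes a comma-separated list, so join the jar paths with commas.
LIBJARS=""
for jar in $STATIC_JARS; do
  LIBJARS="${LIBJARS:+$LIBJARS,}$jar"
done

# Dry run: show the submission command rather than executing it.
CMD="hadoop jar $JOB_JAR $DRIVER -libjars $LIBJARS input output"
echo "$CMD"
```

	How much submission time this saves depends on where the library jars live and how
your cluster stages them, so measure before and after.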
