hadoop-general mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Large startup time in remote MapReduce job
Date Tue, 21 Jun 2011 20:31:23 GMT

If your jar does not contain code changes that need to be transmitted
every time, consider placing it on the JT/TT (JobTracker/TaskTracker)
classpaths and skipping the jar registration in your job submission
code. You'll see a related WARN, but it should be OK to ignore that.

If that is not an option, look for other ways to reduce your jar size.
Does it really contain 20 MB of user code, or does that include libraries?
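
One way to answer that question is to look at where the bytes in the jar
actually go. The following is only a rough sketch using the standard
java.util.zip classes, not anything Hadoop-specific: the class name
JarSizeReport, the grouping depth of 2, and the "(root)" label are my own
assumptions for illustration. It sums the compressed size of the entries
per leading directory, so a bundled lib/ directory or a large third-party
package should stand out.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of Hadoop): reports which directories
// inside a jar account for most of its compressed size.
public class JarSizeReport {

    // Returns the first `depth` directory segments of an entry name,
    // e.g. "org/apache" for "org/apache/foo/Bar.class" with depth 2,
    // or "(root)" for an entry that sits at the top level of the jar.
    public static String groupKey(String entryName, int depth) {
        String[] parts = entryName.split("/");
        int dirs = Math.min(depth, parts.length - 1); // last part is the file name
        if (dirs <= 0) {
            return "(root)";
        }
        return String.join("/", Arrays.copyOf(parts, dirs));
    }

    // Sums the compressed size in bytes of all file entries per group.
    public static Map<String, Long> compressedBytesByGroup(String jarPath, int depth)
            throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                // getCompressedSize() returns -1 when unknown; count that as 0.
                long size = Math.max(0L, entry.getCompressedSize());
                totals.merge(groupKey(entry.getName(), depth), size, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarSizeReport <path-to-jar>");
            return;
        }
        // Print the groups from largest to smallest.
        compressedBytesByGroup(args[0], 2).entrySet().stream()
                .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
                .forEach(e -> System.out.printf("%12d  %s%n", e.getValue(), e.getKey()));
    }
}
```

Run it against the job jar (java JarSizeReport yourjob.jar, where
yourjob.jar stands in for your own file). If most of the size sits under
lib/ or in third-party packages rather than in your own code, those
libraries are the part worth keeping out of the per-job upload.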

On Wed, Jun 22, 2011 at 1:57 AM, Gabor Makrai <makrai.list@gmail.com> wrote:
> Hi everyone,
> I have a little problem with running MapReduce jobs.
> I have a pretty large Java program (my jar size is more than 20 MB), where I
> implemented a MapReduce job. I tested it in my local cluster, and it worked
> fine. But I tried it with low-bandwidth Internet access and I experienced
> a very, very slow job start time :( I guess my whole JAR file was uploaded,
> because I saw unusually high outgoing network traffic.
> Could anyone tell me how I can solve this problem?
> Thanks,
> Gabor

Harsh J
