hadoop-common-user mailing list archives

From francexo83 <francex...@gmail.com>
Subject Re: MR job fails with too many mappers
Date Wed, 19 Nov 2014 09:14:21 GMT
Thank you very much for your suggestion, it was very helpful.

This is what I got after turning off log aggregation:

2014-11-18 18:39:01,507 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_1416332245344_0004
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_1416332245344_0004
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1551)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1406)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1373)
        at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:986)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1249)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1049)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1460)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1456)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1389)
Caused by: java.io.IOException: Split metadata size exceeded 10000000. Aborting job job_1416332245344_0004
        at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:53)
        at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1546)


The job had exceeded the split metadata size limit, so I added the following
property to mapred-site.xml and it worked:

<property>
    <name>mapreduce.job.split.metainfo.maxsize</name>
    <value>500000000</value>
</property>
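
The numbers in this thread allow a rough sanity check of the new limit. The sketch below is only an estimate: it assumes the job began failing right at about 300000 splits (the figure from the original post) and that every split adds roughly the same amount of metadata. The function name `scaled_split_capacity` is hypothetical and is not part of any Hadoop API.

```python
# Back-of-the-envelope estimate only. Assumptions (not confirmed in the thread):
#   * the job started failing at roughly 300,000 splits, and
#   * each split contributes about the same number of metadata bytes.
DEFAULT_MAX_BYTES = 10_000_000   # limit quoted in the exception message
RAISED_MAX_BYTES = 500_000_000   # value set in mapred-site.xml above
FAILING_SPLITS = 300_000         # approximate mapper count at failure


def scaled_split_capacity(old_limit: int, new_limit: int, splits_at_old_limit: int) -> int:
    """Scale the split count linearly with the size limit (integer arithmetic)."""
    return new_limit * splits_at_old_limit // old_limit


# Average metadata size per split implied by hitting the default limit.
per_split = DEFAULT_MAX_BYTES / FAILING_SPLITS

# Approximate number of splits that would fit under the raised limit.
headroom = scaled_split_capacity(DEFAULT_MAX_BYTES, RAISED_MAX_BYTES, FAILING_SPLITS)

print(f"about {per_split:.1f} bytes of metadata per split")
print(f"about {headroom:,} splits fit under the raised limit")
```

Under these assumptions, each split accounts for roughly 33 bytes of metadata, and the raised limit leaves room for about 15 million splits, far more than the 300000 mappers described here.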

Thanks again.

2014-11-18 17:59 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com>:

> If log aggregation is enabled, the log folder will be deleted. So I suggest
> disabling “yarn.log-aggregation-enable” and running the job again. All the logs
> will then remain in the log folder, where you can find the container logs.
>
>
>
> Thanks & Regards
>
> Rohith Sharma K S
>
>
>
> *From:* francexo83 [mailto:francexo83@gmail.com]
> *Sent:* 18 November 2014 22:15
> *To:* user@hadoop.apache.org
> *Subject:* Re: MR job fails with too many mappers
>
>
>
> Hi,
>
>
>
> thank you for your quick response, but I was not able to see the logs for
> the container.
>
>
>
> I get a "No such file or directory" error when I try to access the logs of the
> container from the shell:
>
>
>
> cd /var/log/hadoop-yarn/containers/application_1416304409718_0032
>
>
>
>
>
> It seems that the container has never been created.
>
>
>
>
>
>
>
> thanks
>
>
>
>
>
>
>
>
>
>
> 2014-11-18 16:43 GMT+01:00 Rohith Sharma K S <rohithsharmaks@huawei.com>:
>
> Hi
>
>
>
> Could you get the stderr and stdout logs for the container? These logs will be
> available in the same location as the container's syslog:
>
> ${yarn.nodemanager.log-dirs}/<app-id>/<container-id>
>
> This helps to find the problem.
>
>
>
>
>
> Thanks & Regards
>
> Rohith Sharma K S
>
>
>
> *From:* francexo83 [mailto:francexo83@gmail.com]
> *Sent:* 18 November 2014 20:53
> *To:* user@hadoop.apache.org
> *Subject:* MR job fails with too many mappers
>
>
>
> Hi All,
>
>
>
> I have a small Hadoop cluster with three nodes and HBase 0.98.1 installed
> on it.
>
>
>
> The Hadoop version is 2.3.0, and my use case is described below.
>
>
>
> I wrote a MapReduce program that reads data from an HBase table and applies
> some transformations to the data.
>
> The jobs are very simple, so they do not need a reduce phase. I also wrote a
> TableInputFormat extension in order to maximize the number of concurrent
> maps on the cluster.
>
> In other words, each row should be processed by a single map task.
>
>
>
> Everything goes well until the number of rows, and consequently of mappers,
> exceeds 300000.
>
>
>
> This is the only exception I see when the job fails:
>
>
>
> Application application_1416304409718_0032 failed 2 times due to AM
> Container for appattempt_1416304409718_0032_000002 exited with exitCode: 1
> due to:
>
>
>
>
>
> Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
> org.apache.hadoop.util.Shell$ExitCodeException:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:511)
> at org.apache.hadoop.util.Shell.run(Shell.java:424)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:656)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
>
> Container exited with a non-zero exit code 1
>
>
>
>
>
> Cluster configuration details:
>
> Node1: 12 GB, 4 core
>
> Node2: 6 GB, 4 core
>
> Node3: 6 GB, 4 core
>
>
>
> yarn.scheduler.minimum-allocation-mb=2048
>
> yarn.scheduler.maximum-allocation-mb=4096
>
> yarn.nodemanager.resource.memory-mb=6144
>
>
>
>
>
>
>
> Regards
>
>
>
