hadoop-hdfs-user mailing list archives

From Chris MacKenzie <stu...@chrismackenziephotography.co.uk>
Subject Re: Configuration set up questions - Container killed on request. Exit code is 143
Date Mon, 21 Jul 2014 08:38:57 GMT
Thanks Ozawa


Regards,

Chris MacKenzie
Expert in all aspects of photography
telephone: 0131 332 6967
email: studio@chrismackenziephotography.co.uk
corporate: www.chrismackenziephotography.co.uk
weddings: www.wedding.chrismackenziephotography.co.uk
 <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
<http://twitter.com/#!/MacKenzieStudio>
<http://www.facebook.com/pages/Chris-MacKenzie-Photography/145946284250>
<http://www.linkedin.com/in/chrismackenziephotography/>
<http://pinterest.com/ChrisMacKenzieP/>




On 18/07/2014 18:07, "Tsuyoshi OZAWA" <...> wrote:

>Hi Chris MacKenzie,
>
>How about trying the following to identify the cause of your problem?
>
>1. Set both yarn.nodemanager.pmem-check-enabled and
>yarn.nodemanager.vmem-check-enabled to false.
>2. Set yarn.nodemanager.pmem-check-enabled to true.
>3. Set yarn.nodemanager.pmem-check-enabled to true and
>yarn.nodemanager.vmem-pmem-ratio to a large value (e.g. 100).
>4. Set yarn.nodemanager.pmem-check-enabled to true and
>yarn.nodemanager.vmem-pmem-ratio to the expected value (e.g. 2.1 or so).
>
>If the problem still occurs in case 1, the cause may be a JVM configuration
>problem or something else. If it occurs in case 2, the cause is a shortage
>of physical memory.
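A minimal yarn-site.xml sketch of step 1 above (standard Hadoop 2.x property names; the values shown are for the first experiment and would be changed for the later steps, and NodeManagers need a restart to pick them up, since these are daemon-side settings):

    <!-- yarn-site.xml: step 1 - disable both memory checks while diagnosing -->
    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>false</value>
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>
    <!-- steps 3-4: allowed virtual memory as a multiple of physical memory (2.1 is the default) -->
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>
    </property>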
>
>Thanks,
>- Tsuyoshi
>
>
>On Fri, Jul 18, 2014 at 6:52 PM, Chris MacKenzie
><studio@chrismackenziephotography.co.uk> wrote:
>> Hi Guys,
>>
>> Thanks very much for getting back to me.
>>
>>
>> Thanks Chris - the idea of splitting the data is a great suggestion.
>> Yes Wangda, I was restarting after changing the configs
>>
>> I’ve been checking the relationship between what I thought was in my
>> config files and what Hadoop thought was in them.
>>
>> With:
>>
>> // Print out config file settings for testing.
>> for (Entry<String, String> entry : conf) {
>>     System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
>> }
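A self-contained version of that check, in case it is useful (the class name DumpConf is just illustrative; note that a bare new Configuration() only loads core-default.xml and core-site.xml from the classpath, so inside a driver you would iterate the job's configuration to also see the mapred/yarn values):

    import java.util.Map.Entry;
    import org.apache.hadoop.conf.Configuration;

    // Illustrative utility: print every effective key=value pair so the values
    // Hadoop actually loaded can be compared against the *-site.xml files.
    public class DumpConf {
        public static void main(String[] args) {
            Configuration conf = new Configuration(); // loads core-default.xml + core-site.xml
            for (Entry<String, String> entry : conf) {
                System.out.printf("%s=%s%n", entry.getKey(), entry.getValue());
            }
        }
    }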
>>
>>
>>
>> There were anomalies ;0(
>>
>> Now that Hadoop reflects the values that are in my config files, I just
>> get the message “Killed” without any explanation.
>>
>>
>> Unfortunately, whereas I had been applying changes incrementally and
>> testing, this time I applied all the changes at once.
>>
>> I’m now slowly backing out those changes to see where things start to
>> match what I expect.
>>
>> Regards,
>>
>> Chris MacKenzie
>> telephone: 0131 332 6967
>> email: studio@chrismackenziephotography.co.uk
>> corporate: www.chrismackenziephotography.co.uk
>> <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
>> <http://www.linkedin.com/in/chrismackenziephotography/>
>>
>>
>>
>>
>>
>>
>> From:  Chris Mawata <chris.mawata@gmail.com>
>> Reply-To:  <user@hadoop.apache.org>
>> Date:  Thursday, 17 July 2014 16:15
>> To:  <user@hadoop.apache.org>
>> Subject:  Re: Configuration set up questions - Container killed on
>> request. Exit code is 143
>>
>>
>> Another thing to try is smaller input splits, if your data can be broken up
>> into smaller files that can be independently processed. That way you get
>> more but smaller map tasks. You could also use more but smaller reducers.
>> The many files will tax your NameNode more, but you might get to use all
>> your cores.
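A sketch of how that might look in the job driver, if the input comes through FileInputFormat (the 64 MB cap is purely illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    // Driver fragment (inside a main() that declares throws Exception):
    // cap the split size so each map task processes less data and more map tasks are created.
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "concordance");               // job name is illustrative
    FileInputFormat.setMaxInputSplitSize(job, 64L * 1024 * 1024); // 64 MB per split (illustrative)
    // Equivalent to setting mapreduce.input.fileinputformat.split.maxsize in the job configuration.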
>> On Jul 17, 2014 9:07 AM, "Chris MacKenzie"
>> <studio@chrismackenziephotography.co.uk> wrote:
>>
>> Hi Chris,
>>
>> Thanks for getting back to me. I will set that value to 10
>>
>> I have just tried this.
>> 
>> https://support.gopivotal.com/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
>>
>> Setting both mapreduce.map.memory.mb and mapreduce.reduce.memory.mb. Though
>> after setting them I didn’t get the expected change.
>>
>> The output was still “2.1 GB of 2.1 GB virtual memory used. Killing
>> container”.
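For reference, the usual pairing is the container size (mapreduce.*.memory.mb) plus a smaller JVM heap (mapreduce.*.java.opts); a sketch with illustrative values that would still have to fit within what the NodeManagers offer:

    <!-- mapred-site.xml: per-task container sizes and JVM heaps (illustrative values) -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx1536m</value> <!-- heap kept below the container size -->
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx1536m</value>
    </property>

The virtual-memory ceiling in the kill message is memory.mb multiplied by yarn.nodemanager.vmem-pmem-ratio, so a ceiling that still reads 2.1 GB after such a change would suggest the new container size was not actually applied.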
>>
>>
>> Regards,
>>
>> Chris MacKenzie
>> telephone: 0131 332 6967
>> email: studio@chrismackenziephotography.co.uk
>> corporate: www.chrismackenziephotography.co.uk
>> <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
>> <http://www.linkedin.com/in/chrismackenziephotography/>
>>
>>
>>
>>
>>
>>
>> From:  Chris Mawata <chris.mawata@gmail.com>
>> Reply-To:  <user@hadoop.apache.org>
>> Date:  Thursday, 17 July 2014 13:36
>> To:  Chris MacKenzie <studio@chrismackenziephotography.co.uk>
>> Cc:  <user@hadoop.apache.org>
>> Subject:  Re: Configuration set up questions - Container killed on
>> request. Exit code is 143
>>
>>
>> Hi Chris MacKenzie, I have a feeling (I am not familiar with the kind of
>> work you are doing) that your application is memory intensive. 8 cores per
>> node and only 12 GB is tight. Try bumping up
>> yarn.nodemanager.vmem-pmem-ratio.
>> Chris Mawata
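To put numbers on that against the error quoted further down: the container was allocated 1 GB of physical memory, so with the default vmem-pmem ratio of 2.1 the virtual-memory ceiling is 1 GB x 2.1 = 2.1 GB; the JVM's virtual footprint reached 2.2 GB, just over that line, which is exactly the "2.2 GB of 2.1 GB virtual memory used. Killing container" message. Raising the ratio (or the container's physical allocation) raises that ceiling.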
>>
>>
>>
>>
>> On Wed, Jul 16, 2014 at 11:37 PM, Chris MacKenzie
>> <studio@chrismackenziephotography.co.uk> wrote:
>>
>> Hi,
>>
>> Thanks Chris Mawata
>> I’m working through this myself, but wondered if anyone could point me
>>in
>> the right direction.
>>
>> I have attached my configs.
>>
>>
>> I’m using Hadoop 2.4.1
>>
>> My system is:
>> 32-node cluster
>> 8 processors per machine
>> 12 GB RAM per machine
>> 890 GB available disk space per node
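As a rough sizing note for nodes like these (the split below is an assumption, not something stated in the thread): with 12 GB of RAM per node one might leave roughly 3-4 GB for the OS, DataNode and NodeManager and offer the rest to YARN containers, for example:

    <!-- yarn-site.xml: memory offered to containers on a 12 GB node (illustrative) -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>8192</value>
    </property>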
>>
>> This is my current error:
>>
>> mapreduce.Job (Job.java:printTaskEvents(1441)) - Task Id :
>> attempt_1405538067846_0006_r_000000_1, Status : FAILED
>> Container [pid=25848,containerID=container_1405538067846_0006_01_000004]
>> is running beyond virtual memory limits. Current usage: 439.0 MB of 1 GB
>> physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing
>> container.
>> Dump of the process-tree for container_1405538067846_0006_01_000004 :
>>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>         |- 25853 25848 25848 25848 (java) 2262 193 2268090368 112050
>> /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true
>> -Dhadoop.metrics.log.level=WARN -Xmx768m
>> -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/application_1405538067846_0006/container_1405538067846_0006_01_000004/tmp
>> -Dlog4j.configuration=container-log4j.properties
>> -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846_0006/container_1405538067846_0006_01_000004
>> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>> org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056
>> attempt_1405538067846_0006_r_000000_1 4
>>         |- 25848 25423 25848 25848 (bash) 0 0 108613632 333 /bin/bash -c
>> /usr/java/latest//bin/java -Djava.net.preferIPv4Stack=true
>> -Dhadoop.metrics.log.level=WARN -Xmx768m
>> -Djava.io.tmpdir=/tmp/hadoop-cm469/nm-local-dir/usercache/cm469/appcache/application_1405538067846_0006/container_1405538067846_0006_01_000004/tmp
>> -Dlog4j.configuration=container-log4j.properties
>> -Dyarn.app.container.log.dir=/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846_0006/container_1405538067846_0006_01_000004
>> -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
>> org.apache.hadoop.mapred.YarnChild 137.195.143.103 59056
>> attempt_1405538067846_0006_r_000000_1 4
>> 1>/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846_0006/container_1405538067846_0006_01_000004/stdout
>> 2>/scratch/extra/cm469/hadoop-2.4.1/logs/userlogs/application_1405538067846_0006/container_1405538067846_0006_01_000004/stderr
>>
>> Container killed on request. Exit code is 143
>> Container exited with a non-zero exit code 143
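Exit code 143 is 128 + 15, i.e. the child JVM was terminated with SIGTERM: the NodeManager killed the container once the virtual-memory check above fired, so the 143 is the symptom of that kill rather than a separate failure.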
>>
>>
>>
>>
>>
>>
>> Regards,
>>
>> Chris MacKenzie
>> telephone: 0131 332 6967
>> email: studio@chrismackenziephotography.co.uk
>> corporate: www.chrismackenziephotography.co.uk
>> <http://plus.google.com/+ChrismackenziephotographyCoUk/posts>
>> <http://www.linkedin.com/in/chrismackenziephotography/>
>>
>>
>>
>>
>>
>>
>> From:  Chris Mawata <chris.mawata@gmail.com>
>> Reply-To:  <user@hadoop.apache.org>
>> Date:  Thursday, 17 July 2014 02:10
>> To:  <user@hadoop.apache.org>
>> Subject:  Re: Can someone shed some light on this ? -
>>java.io.IOException:
>> Spill failed
>>
>>
>> I would post the configuration files -- easier for someone to spot
>> something wrong than to imagine what configuration would produce that
>> stack trace. The part
>> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could
>> not find any valid local directory for
>> attempt_1405523201400_0006_m_000000_0_spill_8.out
>>
>> would suggest you might not have hadoop.tmp.dir set (?)
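If that is the issue, the settings involved are roughly these (a sketch; the /scratch paths are only examples). The process dump earlier in this message shows spills landing under /tmp/hadoop-cm469/nm-local-dir, i.e. the defaults are in effect, and this DiskErrorException is what LocalDirAllocator raises when none of the configured local directories has enough free space:

    <!-- core-site.xml: base scratch directory (default is /tmp/hadoop-${user.name}) -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/scratch/extra/${user.name}/hadoop-tmp</value>
    </property>
    <!-- yarn-site.xml: where NodeManagers keep container working data, including map
         spill files (defaults to ${hadoop.tmp.dir}/nm-local-dir) -->
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/scratch/extra/${user.name}/nm-local-dir</value>
    </property>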
>>
>>
>>
>> On Wed, Jul 16, 2014 at 1:02 PM, Chris MacKenzie
>> <studio@chrismackenziephotography.co.uk> wrote:
>>
>> Hi,
>>
>> Is this a coding or a setup issue?
>>
>> I’m using Hadoop 2.4.1.
>> My program is doing a concordance on 500,000 sequences of 400 chars.
>> My cluster is 32 data nodes and two masters.
>>
>> The exact error is:
>> Error: java.io.IOException: Spill failed
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.checkSpillException(MapTask.java:1535)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1062)
>>         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>         at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>         at par.gene.align.v3.concordance.ConcordanceMapper.map(ConcordanceMapper.java:96)
>>         at par.gene.align.v3.concordance.ConcordanceMapper.map(ConcordanceMapper.java:1)
>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>> Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not
>> find any valid local directory for attempt_1405523201400_0006_m_000000_0_spill_8.out
>>         at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:402)
>>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
>>         at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
>>         at org.apache.hadoop.mapred.YarnOutputFiles.getSpillFileForWrite(YarnOutputFiles.java:159)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1566)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$900(MapTask.java:853)
>>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1505)
>>
>> Regards,
>>
>> Chris
>>
>>
>
>
>
>-- 
>- Tsuyoshi


