hadoop-hdfs-issues mailing list archives

From "Hanisha Koneru (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDDS-609) SCM does not exit chill mode as it expects DNs to report containers in ALLOCATED state
Date Thu, 11 Oct 2018 19:25:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hanisha Koneru updated HDDS-609:
--------------------------------
    Summary: SCM does not exit chill mode as it expects DNs to report containers in ALLOCATED state  (was: Mapreduce example fails with Allocate block failed, error:INTERNAL_ERROR)

> SCM does not exit chill mode as it expects DNs to report containers in ALLOCATED state
> --------------------------------------------------------------------------------------
>
>                 Key: HDDS-609
>                 URL: https://issues.apache.org/jira/browse/HDDS-609
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Namit Maheshwari
>            Priority: Major
>
> {code:java}
> -bash-4.2$ /usr/hdp/current/hadoop-client/bin/hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar wordcount /tmp/mr_jobs/input/ o3://bucket2.volume2/mr_job5
> 18/10/09 23:37:07 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:08 INFO client.AHSProxy: Connecting to Application History server at ctr-e138-1518143905142-510793-01-000004.hwx.site/172.27.79.197:10200
> 18/10/09 23:37:08 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> 18/10/09 23:37:09 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /user/hdfs/.staging/job_1539125785626_0007
> 18/10/09 23:37:09 INFO input.FileInputFormat: Total input files to process : 1
> 18/10/09 23:37:09 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 18/10/09 23:37:09 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 5d6248d8d690f8456469979213ab2e9993bfa2e9]
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: number of splits:1
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1539125785626_0007
> 18/10/09 23:37:09 INFO mapreduce.JobSubmitter: Executing with tokens: []
> 18/10/09 23:37:10 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:10 INFO conf.Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.3.0-63/0/resource-types.xml
> 18/10/09 23:37:10 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:10 INFO impl.YarnClientImpl: Submitted application application_1539125785626_0007
> 18/10/09 23:37:10 INFO mapreduce.Job: The url to track the job: http://ctr-e138-1518143905142-510793-01-000005.hwx.site:8088/proxy/application_1539125785626_0007/
> 18/10/09 23:37:10 INFO mapreduce.Job: Running job: job_1539125785626_0007
> 18/10/09 23:37:17 INFO mapreduce.Job: Job job_1539125785626_0007 running in uber mode : false
> 18/10/09 23:37:17 INFO mapreduce.Job: map 0% reduce 0%
> 18/10/09 23:37:24 INFO mapreduce.Job: map 100% reduce 0%
> 18/10/09 23:37:29 INFO mapreduce.Job: Task Id : attempt_1539125785626_0007_r_000000_0, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:35 INFO mapreduce.Job: Task Id : attempt_1539125785626_0007_r_000000_1, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:42 INFO mapreduce.Job: Task Id : attempt_1539125785626_0007_r_000000_2, Status : FAILED
> Error: java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
> at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:576)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:475)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:271)
> at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:250)
> at org.apache.hadoop.fs.ozone.OzoneFSOutputStream.write(OzoneFSOutputStream.java:47)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
> at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:93)
> at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:559)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:64)
> at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:52)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:628)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:390)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> 18/10/09 23:37:49 INFO mapreduce.Job: map 100% reduce 100%
> 18/10/09 23:37:50 INFO mapreduce.Job: Job job_1539125785626_0007 failed with state FAILED due to: Task failed task_1539125785626_0007_r_000000
> Job failed as tasks failed. failedMaps:0 failedReduces:1 killedMaps:0 killedReduces:0
> 18/10/09 23:37:50 INFO conf.Configuration: Removed undeclared tags:
> 18/10/09 23:37:51 INFO mapreduce.Job: Counters: 45
> File System Counters
> FILE: Number of bytes read=0
> FILE: Number of bytes written=266505
> FILE: Number of read operations=0
> FILE: Number of large read operations=0
> FILE: Number of write operations=0
> HDFS: Number of bytes read=215876
> HDFS: Number of bytes written=0
> HDFS: Number of read operations=2
> HDFS: Number of large read operations=0
> HDFS: Number of write operations=0
> O3: Number of bytes read=0
> O3: Number of bytes written=0
> O3: Number of read operations=0
> O3: Number of large read operations=0
> O3: Number of write operations=0
> Job Counters
> Failed reduce tasks=4
> Launched map tasks=1
> Launched reduce tasks=4
> Rack-local map tasks=1
> Total time spent by all maps in occupied slots (ms)=18816
> Total time spent by all reduces in occupied slots (ms)=128016
> Total time spent by all map tasks (ms)=4704
> Total time spent by all reduce tasks (ms)=16002
> Total vcore-milliseconds taken by all map tasks=4704
> Total vcore-milliseconds taken by all reduce tasks=16002
> Total megabyte-milliseconds taken by all map tasks=19267584
> Total megabyte-milliseconds taken by all reduce tasks=131088384
> Map-Reduce Framework
> Map input records=716
> Map output records=32019
> Map output bytes=343475
> Map output materialized bytes=6332
> Input split bytes=121
> Combine input records=32019
> Combine output records=461
> Spilled Records=461
> Failed Shuffles=0
> Merged Map outputs=0
> GC time elapsed (ms)=96
> CPU time spent (ms)=3300
> Physical memory (bytes) snapshot=2528358400
> Virtual memory (bytes) snapshot=5420421120
> Total committed heap usage (bytes)=2714238976
> Peak Map Physical memory (bytes)=2528358400
> Peak Map Virtual memory (bytes)=5420421120
> File Input Format Counters
> Bytes Read=215755
> 18/10/09 23:37:51 INFO conf.Configuration: Removed undeclared tags:
> {code}
> SCM logs
> {code:java}
> 2018-10-09 23:37:28,984 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9863, call Call#101 Retry#0 org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed for allocateBlock
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:35,232 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9863, call Call#103 Retry#0 org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed for allocateBlock
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:42,044 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9863, call Call#105 Retry#0 org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed for allocateBlock
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2018-10-09 23:37:48,656 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9863, call Call#107 Retry#0 org.apache.hadoop.ozone.protocol.ScmBlockLocationProtocol.allocateScmBlock from 172.27.56.9:33814
> org.apache.hadoop.hdds.scm.exceptions.SCMException: ChillModePrecheck failed for allocateBlock
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:38)
> at org.apache.hadoop.hdds.scm.server.ChillModePrecheck.check(ChillModePrecheck.java:30)
> at org.apache.hadoop.hdds.scm.ScmUtils.preCheck(ScmUtils.java:42)
> at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:191)
> at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:143)
> at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:74)
> at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:6255)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {code}
>  
>  
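The SCM log shows every allocateScmBlock call being rejected by ChillModePrecheck, which is what surfaces on the client as "Allocate block failed, error:INTERNAL_ERROR". The updated summary points at the cause: the chill-mode exit check also counts containers that are still in the ALLOCATED state, which DataNodes never report because nothing has been written to them. Below is a minimal, self-contained sketch of that arithmetic; the class, names, and the 99% threshold are assumptions for illustration and this is not the actual SCM chill-mode code.

{code:java}
// Hypothetical sketch only: names, the 99% cutoff, and the counting logic are
// assumptions for illustration; this is NOT the actual SCM chill-mode code.
import java.util.EnumMap;
import java.util.Map;

public class ChillModeExitSketch {

    enum LifeCycleState { ALLOCATED, OPEN, CLOSED }

    // Assumed container-report threshold for leaving chill mode.
    static final double THRESHOLD = 0.99;

    static boolean canExitChillMode(Map<LifeCycleState, Integer> containersByState,
                                    int containersReportedByDataNodes) {
        int open = containersByState.getOrDefault(LifeCycleState.OPEN, 0);
        int allocated = containersByState.getOrDefault(LifeCycleState.ALLOCATED, 0);

        // Behaviour described by this issue: ALLOCATED containers sit in the
        // denominator, but DataNodes never report them (nothing was written),
        // so the ratio can never reach the threshold.
        double buggyRatio = (double) containersReportedByDataNodes / (open + allocated);

        // What the check presumably should use: only containers that DataNodes
        // can actually report replicas for.
        double fixedRatio = open == 0 ? 1.0
                : (double) containersReportedByDataNodes / open;

        System.out.printf("buggy=%.2f fixed=%.2f%n", buggyRatio, fixedRatio);
        return buggyRatio >= THRESHOLD; // never true while ALLOCATED containers exist
    }

    public static void main(String[] args) {
        Map<LifeCycleState, Integer> containers = new EnumMap<>(LifeCycleState.class);
        containers.put(LifeCycleState.OPEN, 5);      // reported by DataNodes
        containers.put(LifeCycleState.ALLOCATED, 3); // never reported
        canExitChillMode(containers, 5);             // buggy=0.63 fixed=1.00
    }
}
{code}

With 3 of 8 containers still ALLOCATED, the fraction the DataNodes can ever report tops out well below the assumed threshold, so SCM stays in chill mode and each client allocateBlock call keeps failing, matching both logs above.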



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


