hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HDDS-1451) SCMBlockManager findPipeline and createPipeline are not lock protected
Date Sat, 18 May 2019 09:55:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-1451?focusedWorklogId=244517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-244517
]

ASF GitHub Bot logged work on HDDS-1451:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/May/19 09:54
            Start Date: 18/May/19 09:54
    Worklog Time Spent: 10m 
      Work Description: anuengineer commented on issue #799: HDDS-1451 : SCMBlockManager findPipeline
and createPipeline are not lock protected.
URL: https://github.com/apache/hadoop/pull/799#issuecomment-493664281
 
 
   @avijayanhwx  the patch looks good to me. Can you please confirm that the test failures
are not due to this patch. Thanks in Advance.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 244517)
    Time Spent: 1h 20m  (was: 1h 10m)

> SCMBlockManager findPipeline and createPipeline are not lock protected
> ----------------------------------------------------------------------
>
>                 Key: HDDS-1451
>                 URL: https://issues.apache.org/jira/browse/HDDS-1451
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: SCM
>    Affects Versions: 0.3.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Aravindan Vijayan
>            Priority: Major
>              Labels: MiniOzoneChaosCluster, pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> SCM BlockManager may try to allocate pipelines in the cases when it is not needed. This
happens because BlockManagerImpl#allocateBlock is not lock protected, so multiple pipelines
can be allocated from it. One of the pipeline allocation can fail even when one of the existing
pipeline already exists.
> {code}
> 2019-04-22 22:34:14,336 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: 6f4bb2d7-d660-4f9f-bc06-72b10f9a738e, Nodes: 76e1a493-fd55-4d67-9f5
> 5-c04fd6bd3a33{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}2b9850b2-aed3-4a40-91b5-2447dc5246bf{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}12248721-ea6a-453f-8dad-fc7fbe692f
> d2{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE,
State:OPEN]
> 2019-04-22 22:34:14,386 INFO  impl.RoleInfo (RoleInfo.java:shutdownLeaderElection(134))
- e17b7852-4691-40c7-8791-ad0b0da5201f: shutdown LeaderElection
> 2019-04-22 22:34:14,388 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: 552e28f3-98d9-41f3-86e0-c1b9494838a5, Nodes: e17b7852-4691-40c7-879
> 1-ad0b0da5201f{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}fd365bac-e26e-4b11-afd8-9d08cd1b0521{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}9583a007-7f02-4074-9e26-19bc18e29e
> c5{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE,
State:OPEN]
> 2019-04-22 22:34:14,388 INFO  impl.RoleInfo (RoleInfo.java:updateAndGet(143)) - e17b7852-4691-40c7-8791-ad0b0da5201f:
start FollowerState
> 2019-04-22 22:34:14,388 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: 5383151b-d625-4362-a7dd-c0d353acaf76, Nodes: 80f16ad6-3879-4a64-a3c
> 7-7719813cc139{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}082ce481-7fb0-4f88-ac21-82609290a6a2{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}dd5f5a70-0217-4577-b7a2-c42aa139d1
> 8a{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE,
State:OPEN]
> 2019-04-22 22:34:14,389 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: be4854e5-7933-4caa-b32e-f482cf500247, Nodes: 6e2356f1-479d-498b-876
> a-1c90623c498b{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}8ac46d94-9975-4eea-9448-2618c69d7bf3{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}a3ed36a1-44ca-47b2-b9b3-5aeef04595
> 18{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE,
State:OPEN]
> 2019-04-22 22:34:14,390 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: 21e368e2-f82a-4c61-9cc3-06e8de22ea6b, Nodes: 82632040-5754-4122-b187-331879586842{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}923c8537-b869-4085-adcb-0a9accdcd089{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}c6d790bf-e3a6-4064-acb5-f74796cd38a9{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,390 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103))
-  pipeline Pipeline[ Id: cccbc2ed-e0e2-4578-a8a2-94f4b645be52, Nodes: 91ae6848-a778-43be-a4a1-5855f7adc0d8{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}8f330a03-40e2-4bd1-9b43-5e05b13d89f0{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}4f3070dc-650b-48d7-87b5-d2076104e7b4{ip:
192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,392 ERROR block.BlockManagerImpl (BlockManagerImpl.java:allocateBlock(192))
- Pipeline creation failed for type:RATIS factor:THREE
> org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot create pipeline
of factor 3 using 2 nodes 20 healthy nodes 20 all nodes.
>         at org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.create(RatisPipelineProvider.java:122)
>         at org.apache.hadoop.hdds.scm.pipeline.PipelineFactory.create(PipelineFactory.java:57)
>         at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.createPipeline(SCMPipelineManager.java:148)
>         at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:190)
>         at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:172)
>         at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:82)
>         at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:7533)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2019-04-22 22:34:14,395 ERROR block.BlockManagerImpl (BlockManagerImpl.java:allocateBlock(213))
- Unable to allocate a block for the size: 16384, type: RATIS, factor: THREE
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message