hadoop-ozone-issues mailing list archives

From "Anu Engineer (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (HDDS-2646) Start acceptance tests only if at least one THREE pipeline is available
Date Wed, 04 Dec 2019 20:25:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anu Engineer resolved HDDS-2646.
--------------------------------
    Fix Version/s: 0.5.0
       Resolution: Fixed

Committed to master. Thanks for the contribution.

> Start acceptance tests only if at least one THREE pipeline is available
> -----------------------------------------------------------------------
>
>                 Key: HDDS-2646
>                 URL: https://issues.apache.org/jira/browse/HDDS-2646
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Marton Elek
>            Assignee: Marton Elek
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.5.0
>
>         Attachments: docker-ozoneperf-ozoneperf-basic-scm.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> After HDDS-2034 (or even before?), pipeline creation (or rather the status transition from ALLOCATED to OPEN) requires at least one pipeline report from all of the datanodes. This means that the cluster might not be usable even if it is out of safe mode AND there are at least three datanodes.
> This makes all the acceptance tests unstable.
> For example in [this|https://github.com/apache/hadoop-ozone/pull/263/checks?check_run_id=324489319]
run.
> {code:java}
> scm_1         | 2019-11-28 11:22:54,401 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb
create command to datanode 548f146f-2166-440a-b9f1-83086591ae26
> scm_1         | 2019-11-28 11:22:54,402 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb
create command to datanode dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c
> scm_1         | 2019-11-28 11:22:54,404 INFO pipeline.RatisPipelineProvider: Send pipeline:PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb
create command to datanode 47dbb8e4-bbde-4164-a798-e47e8c696fb5
> scm_1         | 2019-11-28 11:22:54,405 INFO pipeline.PipelineStateManager: Created pipeline
Pipeline[ Id: 8dc4aeb6-5ae2-46a0-948d-287c97dd81fb, Nodes: 548f146f-2166-440a-b9f1-83086591ae26{ip:
172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip: 172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default,
networkLocation: /default-rack, certSerialId: null}47dbb8e4-bbde-4164-a798-e47e8c696fb5{ip:
172.24.0.2, host: ozoneperf_datanode_2.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}, Type:RATIS, Factor:THREE, State:ALLOCATED]
> scm_1         | 2019-11-28 11:22:56,975 INFO pipeline.PipelineReportHandler: Pipeline
THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip:
172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}
> scm_1         | 2019-11-28 11:22:58,018 INFO pipeline.PipelineReportHandler: Pipeline
THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip:
172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}
> scm_1         | 2019-11-28 11:23:01,871 INFO pipeline.PipelineReportHandler: Pipeline
THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip:
172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}
> scm_1         | 2019-11-28 11:23:02,817 INFO pipeline.PipelineReportHandler: Pipeline
THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by 548f146f-2166-440a-b9f1-83086591ae26{ip:
172.24.0.10, host: ozoneperf_datanode_3.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null}
> scm_1         | 2019-11-28 11:23:02,847 INFO pipeline.PipelineReportHandler: Pipeline
THREE PipelineID=8dc4aeb6-5ae2-46a0-948d-287c97dd81fb reported by dccee7c4-19b3-41b8-a3f7-b47b0ed45f6c{ip:
172.24.0.5, host: ozoneperf_datanode_1.ozoneperf_default, networkLocation: /default-rack,
certSerialId: null} {code}
> As you can see, the pipeline is created, but the cluster is not usable because the pipeline has not yet been reported back by datanode_2:
> {code:java}
> scm_1         | 2019-11-28 11:23:13,879 WARN block.BlockManagerImpl: Pipeline creation failed for type:RATIS factor:THREE. Retrying get pipelines call once.
> scm_1         | org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot
create pipeline of factor 3 using 0 nodes.{code}
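
The behaviour visible in the logs above can be modelled with a small sketch. It is a simplified illustration, not SCM's actual code: the class `PipelineReportTracker` and its methods are hypothetical, and it assumes that "all of the datanodes" means every member of the pipeline. The pipeline stays ALLOCATED until each member has sent at least one report. Repeated reports from the same datanode do not help.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the ALLOCATED -> OPEN transition described in the issue.
class PipelineReportTracker {
    enum State { ALLOCATED, OPEN }

    private final Set<String> members;                     // datanodes in the pipeline
    private final Set<String> reported = new HashSet<>();  // members that have reported

    PipelineReportTracker(List<String> members) {
        this.members = new HashSet<>(members);
    }

    // Record a pipeline report; reports from non-members are ignored.
    void onReport(String datanode) {
        if (members.contains(datanode)) {
            reported.add(datanode);
        }
    }

    // OPEN only once every member has reported at least once.
    State state() {
        return reported.containsAll(members) ? State.OPEN : State.ALLOCATED;
    }
}
```

Replaying the report sequence from the log (datanode_3, datanode_1, datanode_3, datanode_3, datanode_1) leaves this model in ALLOCATED. Only a report from datanode_2 would move it to OPEN.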
>  The quick fix is to configure all the compose clusters to wait until one pipeline is available. This can be done by adjusting the number of required datanodes:
> {code:java}
> // We only care about THREE replica pipeline
> int minHealthyPipelines = minDatanodes /
>     HddsProtos.ReplicationFactor.THREE_VALUE; {code}
>  
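
To make the integer division in the snippet above concrete, here is a minimal standalone sketch. The class `PipelineThreshold` is hypothetical, and the local constant `THREE_VALUE` is a stand-in assumed to equal 3, matching the "factor 3" in the exception message. It is not the real `HddsProtos` constant.

```java
// Hypothetical standalone illustration of the threshold arithmetic.
class PipelineThreshold {
    // Assumption: stand-in for HddsProtos.ReplicationFactor.THREE_VALUE.
    static final int THREE_VALUE = 3;

    // Integer division: each THREE pipeline needs three datanodes,
    // so any remainder is discarded.
    static int minHealthyPipelines(int minDatanodes) {
        return minDatanodes / THREE_VALUE;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 5, 6}) {
            System.out.println(n + " datanode(s) -> "
                + minHealthyPipelines(n) + " required pipeline(s)");
        }
    }
}
```

Under this arithmetic, a required datanode count below 3 yields a threshold of 0 pipelines, and a count of 3 yields 1. Presumably this is why adjusting the number of required datanodes in the compose clusters is enough to make them wait for one THREE pipeline.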



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org

