hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shashikant Banerjee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-1753) Datanode unable to find chunk while replication data using ratis.
Date Fri, 16 Aug 2019 09:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908895#comment-16908895
] 

Shashikant Banerjee commented on HDDS-1753:
-------------------------------------------

Uploaded patch v0 to address the issue. The patch is rebased on top of HDDS-1610.

> Datanode unable to find chunk while replication data using ratis.
> -----------------------------------------------------------------
>
>                 Key: HDDS-1753
>                 URL: https://issues.apache.org/jira/browse/HDDS-1753
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.4.0
>            Reporter: Mukul Kumar Singh
>            Assignee: Shashikant Banerjee
>            Priority: Major
>              Labels: MiniOzoneChaosCluster
>         Attachments: HDDS-1753.000.patch
>
>
> Leader datanode is unable to read chunk from the datanode while replicating data from
leader to follower.
> Please note that deletion of keys is also happening while the data is being replicated.
> {code}
> 2019-07-02 19:39:22,604 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972))
- 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
> 014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#70:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 ERROR impl.ChunkManagerImpl (ChunkUtils.java:readData(161)) -
Unable to find the chunk file. chunk info : ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3
> -4d64-93d8-fa2ebafee933_chunk_1, offset=0, len=2048}
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990))
- 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already
h
> as the append entries (first index: 1)
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972))
- 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
> 014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#71:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 2019-07-02 19:39:22,605 INFO  keyvalue.KeyValueHandler (ContainerUtils.java:logAndReturnError(146))
- Operation: ReadChunk : Trace ID: 4216d461a4679e17:4216d461a4679e17:0:0 : Message: Unable
to find the c
> hunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1,
offset=0, len=2048} : Result: UNABLE_TO_FIND_CHUNK
> 2019-07-02 19:39:22,605 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(990))
- 5ac88709-a3a2-4c8f-91de-5e54b617f05e: Failed appendEntries as latest snapshot (9770) already
h
> as the append entries (first index: 2)
> 2019-07-02 19:39:22,606 INFO  impl.RaftServerImpl (RaftServerImpl.java:checkInconsistentAppendEntries(972))
- 5ac88709-a3a2-4c8f-91de-5e54b617f05e: inconsistency entries. Reply:76a3eb0f-d7cd-477b-8973-db1
> 014feb398<-5ac88709-a3a2-4c8f-91de-5e54b617f05e#72:FAIL,INCONSISTENCY,nextIndex:9771,term:2,followerCommit:9782
> 19:39:22.606 [pool-195-thread-19] ERROR DNAudit - user=null | ip=null | op=READ_CHUNK
{blockData=conID: 3 locID: 102372189549953034 bcsId: 0} | ret=FAILURE
> java.lang.Exception: Unable to find the chunk file. chunk info ChunkInfo{chunkName='76ec669ae2cb6e10dd9f08c0789c5fdf_stream_a2850dce-def3-4d64-93d8-fa2ebafee933_chunk_1,
offset=0, len=2048}
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:320)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:346)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:476)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$getCachedStateMachineData$2(ContainerStateMachine.java:495)
~[hadoop-hdds-container-service-0.5.0-SN
> APSHOT.jar:?]
>         at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228) ~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache.get(LocalCache.java:3965) ~[guava-11.0.2.jar:?]
>         at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
~[guava-11.0.2.jar:?]
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.getCachedStateMachineData(ContainerStateMachine.java:494)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.ja
> r:?]
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$4(ContainerStateMachine.java:542)
~[hadoop-hdds-container-service-0.5.0-SNAPSHOT.jar:?]
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
[?:1.8.0_171]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_171]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_171]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message