hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shashikant Banerjee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-850) ReadStateMachineData hits OverlappingFileLockException in ContainerStateMachine
Date Fri, 23 Nov 2018 19:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697459#comment-16697459
] 

Shashikant Banerjee commented on HDDS-850:
------------------------------------------

Thanks [~msingh] for the review comments. Patch v3 addresses your review comments.

> ReadStateMachineData hits OverlappingFileLockException in ContainerStateMachine
> -------------------------------------------------------------------------------
>
>                 Key: HDDS-850
>                 URL: https://issues.apache.org/jira/browse/HDDS-850
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.4.0
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>             Fix For: 0.4.0
>
>         Attachments: HDDS-850.000.patch, HDDS-850.001.patch, HDDS-850.002.patch, HDDS-850.003.patch
>
>
> {code:java}
> 2018-11-16 09:54:41,599 ERROR org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(0813f1a9-61be-4cab-aa05-d5640f4a8341
-> c6ad906f-7e71-4bac-bde3-d22bc1aa8c7d) hit IOException while loading raft log
> org.apache.ratis.server.storage.RaftLogIOException: 0813f1a9-61be-4cab-aa05-d5640f4a8341:
Failed readStateMachineData for (t:2, i:1), STATEMACHINELOGENTRY, client-7D19FB803B1E, cid=0
>         at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:370)
>         at org.apache.ratis.server.impl.LogAppender$LogEntryBuffer.getAppendRequest(LogAppender.java:167)
>         at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:216)
>         at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>         at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>         at org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:100)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.nio.channels.OverlappingFileLockException
>         at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255)
>         at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152)
>         at sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
>         at sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
>         at sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
>         at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:178)
>         at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:197)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:542)
>         at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:174)
>         at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:178)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:290)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.readStateMachineData(ContainerStateMachine.java:404)
>         at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$readStateMachineData$6(ContainerStateMachine.java:462)
>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         ... 1 more
> {code}
> This happens in the Ratis leader where the stateMachineData is not  in the cached segments
in Ratis while it gets a request for ReadStateMachineData while writeStateMachineData is not
completed yet. The approach would be to cache the stateMachineData inside ContainerStateMachine
and not cache it inside ratis.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message