hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shashikant Banerjee (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDDS-801) Quasi close the container when close is not executed via Ratis
Date Wed, 14 Nov 2018 14:25:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16686506#comment-16686506
] 

Shashikant Banerjee edited comment on HDDS-801 at 11/14/18 2:24 PM:
--------------------------------------------------------------------

Thanks [~nandakumar131] for working on this. In addition to Mukul's comments ,some more comments
:

1.KeyValueHandler.java : 865  -> update the comment to be container getting "quasi closed"
rather than getting closed.

2.KeyValueHandler.java : 865 -> closeContainer is exposed to clients in ContainerProtocolCalls.Java.
With SCMCLi as well, the close container can be invoked where a client can directly close
(closeContainer in ContainerOperationClient). In such cases, a container in may be in just
open state and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
      INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
      container.getContainerData().getContainerID() + " while in " +
      state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should triggerICR.In that case, closeContainer/quasiCloseContainer
call should call updateContainerState internally to send ICR instead of executing individually.

4. There can be a case where let's say the SCM gets network separated from a follower before
sending a closeCommand but Ratis ring is opeartional. In such case, leader will execute the
closeContainer transaction  successfully and follower will try to replicate the same but
it will fail as the container was never put into closing state in follower before as it was
not communicating with SCM. The assumption that container will be in closing state before
closeContainer is called may not be necessarily true always.

5. KeyValueContainer.java : 310 ->

The comments look misleading here. The first comment specifies the compaction should be done
asynchronously as otherwise it will be lot slower . The next comment says it is ok if the
operation is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}


was (Author: shashikant):
Thanks [~nandakumar131] for working on this. In addition to Mukul's comments ,some more comments
:

1.KeyValueHandler.java : 865  -> update the comment to be container getting "quasi closed"
rather than getting closed.

2.KeyValueHandler.java : 865 -> closeContainer is exposed to clients in ContainerProtocolCalls.Java.
With SCMCLi as well, the close container can be invoked where a client can directly close
(closeContainer in ContainerOperationClient). In such cases, a container in may be in just
open state and hence the exception will be thrown:
{code:java}
// The container has to be in CLOSING state.
if (state != State.CLOSING) {
  ContainerProtos.Result error = state == State.INVALID ?
      INVALID_CONTAINER_STATE : CONTAINER_INTERNAL_ERROR;
  throw new StorageContainerException("Cannot close container #" +
      container.getContainerData().getContainerID() + " while in " +
      state + " state.", error);
}{code}
Should we disallow/remove the closeContainer call exposed to clients/SCMCLI?

3. Any state change in ContainerState should triggerICR.In that case, closeContainer/quasiCloseContainer
call should call updateContainerState internally to send ICR instead of executing individually.

4. There can be a case where let's say the SCM gets network separated from a follower before
sending a closeCommand but Ratis ring is opeartional. In such case, leader will execute the
closeContainer transaction  successfully and follower will try to replicate the same but
it will fail as the container was never put into closing state in follower before as it was
not communicating with SCM. The assumption that container will be in closing state before
closeContainer is called may not be necessarily true always.

5. KeyValueContainer.java : 310 ->

The comments look misleading here. The first comment specifies we compaction should be done
asynchronously as otherwise it will be lot slower . The next comment says it is ok if thhe
opeartion is slow. Can you please check?
{code:java}
@Override
public void close() throws StorageContainerException {

  //TODO: writing .container file and compaction can be done
  // asynchronously, otherwise rpc call for this will take a lot of time to
  // complete this action
  ContainerDataProto.State oldState = null;
  try {
    writeLock();
    oldState = containerData.getState();
    containerData.closeContainer();
    File containerFile = getContainerFile();
    // update the new container data to .container File
    updateContainerFile(containerFile);

  } catch (StorageContainerException ex) {
    if (oldState != null) {
      // Failed to update .container file. Reset the state to CLOSING
      containerData.setState(oldState);
    }
    throw ex;
  } finally {
    writeUnlock();
  }
  // It is ok if this operation takes a bit of time.
  // Close container is not expected to be instantaneous.
  compactDB();
}

{code}

> Quasi close the container when close is not executed via Ratis
> --------------------------------------------------------------
>
>                 Key: HDDS-801
>                 URL: https://issues.apache.org/jira/browse/HDDS-801
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>    Affects Versions: 0.3.0
>            Reporter: Nanda kumar
>            Assignee: Nanda kumar
>            Priority: Major
>         Attachments: HDDS-801.000.patch, HDDS-801.001.patch, HDDS-801.002.patch
>
>
> When datanode received CloseContainerCommand and the replication type is not RATIS, we
should QUASI close the container. After quasi-closing the container an ICR has to be sent
to SCM.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message