hadoop-hdfs-issues mailing list archives

From "Shashikant Banerjee (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDDS-263) Add retries in Ozone Client to handle BLOCK_NOT_COMMITTED Exception
Date Thu, 19 Jul 2018 09:15:00 GMT

     [ https://issues.apache.org/jira/browse/HDDS-263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee updated HDDS-263:
-------------------------------------
    Description: While Ozone client writes are in progress, a container on a datanode can get
closed because of node failures, a disk running out of space, etc. In such situations, the client
write fails with CLOSED_CONTAINER_IO. In this case, the Ozone client should try to get the
committed block length for the pending open blocks and update the OzoneManager. That request
may itself fail with a BLOCK_NOT_COMMITTED exception because, as part of the container's
transition from the CLOSING to the CLOSED state, all open blocks are committed one by one. In
such cases, the client needs to retry getting the committed block length for a fixed number of
attempts, and eventually throw the exception to the application if it is still unable to get the
length and update it in the OzoneManager. This Jira aims to address this.  (was: While
Ozone client writes are in progress, a container on a datanode can get closed because of node
failures, a disk running out of space, etc. In such situations, the client write fails with
CLOSED_CONTAINER_IO. In this case, the Ozone client should try to get the committed block
length for the pending open blocks and update the OzoneManager. That request may itself fail
with a BLOCK_NOT_COMMITTED exception because, as part of the container's transition from the
CLOSING to the CLOSED state, all open blocks are committed one by one. In such cases, the
client needs to retry getting the committed block length for a fixed number of attempts, and
eventually throw the exception to the application if it is still unable to get the length and
update it in the OzoneManager eventually. This Jira aims to address this.)

> Add retries in Ozone Client to handle BLOCK_NOT_COMMITTED Exception
> -------------------------------------------------------------------
>
>                 Key: HDDS-263
>                 URL: https://issues.apache.org/jira/browse/HDDS-263
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>             Fix For: 0.2.1
>
>
> While Ozone client writes are in progress, a container on a datanode can get closed because
of node failures, a disk running out of space, etc. In such situations, the client write fails
with CLOSED_CONTAINER_IO. In this case, the Ozone client should try to get the committed block
length for the pending open blocks and update the OzoneManager. That request may itself fail
with a BLOCK_NOT_COMMITTED exception because, as part of the container's transition from the
CLOSING to the CLOSED state, all open blocks are committed one by one. In such cases, the
client needs to retry getting the committed block length for a fixed number of attempts, and
eventually throw the exception to the application if it is still unable to get the length and
update it in the OzoneManager. This Jira aims to address this.
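
The bounded retry described above could look roughly like the following Java sketch. It is
not the patch attached to this issue: the class, the exception type, the CommittedLengthSource
interface, and the maxAttempts / retryIntervalMillis parameters are all hypothetical names
chosen for illustration, and the fixed sleep between attempts is an assumption (the
description only asks for a fixed number of attempts).

```java
import java.io.IOException;

/** Hypothetical sketch; none of these names are taken from the Ozone code base. */
class CommittedLengthRetry {

    /** Stand-in for the BLOCK_NOT_COMMITTED failure described in the issue. */
    static class BlockNotCommittedException extends IOException {
        BlockNotCommittedException(String message) {
            super(message);
        }
    }

    /** Stand-in for the datanode call that returns a block's committed length. */
    interface CommittedLengthSource {
        long getCommittedBlockLength() throws IOException;
    }

    /**
     * Asks for the committed length up to maxAttempts times. Only
     * BlockNotCommittedException is retried; any other IOException
     * propagates immediately. After the last failed attempt the last
     * BlockNotCommittedException is rethrown to the caller.
     */
    static long getCommittedLengthWithRetry(CommittedLengthSource source,
                                            int maxAttempts,
                                            long retryIntervalMillis) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        BlockNotCommittedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return source.getCommittedBlockLength();
            } catch (BlockNotCommittedException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Assumed fixed pause so the container has time to commit the block.
                        Thread.sleep(retryIntervalMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IOException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated datanode: the block becomes committed on the third request.
        long length = getCommittedLengthWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new BlockNotCommittedException("block still open");
            }
            return 4096L;
        }, 5, 10L);
        System.out.println("committed length " + length + " after " + calls[0] + " attempts");
    }
}
```

In this sketch the caller would update the OzoneManager with the returned length on success,
and surface the rethrown exception to the application once the attempts are exhausted.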



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

