hadoop-hdfs-issues mailing list archives

From "LiXin Ge (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-13086) Add a field about exception type to BlockOpResponseProto
Date Mon, 05 Feb 2018 08:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352126#comment-16352126
] 

LiXin Ge edited comment on HDFS-13086 at 2/5/18 8:31 AM:
---------------------------------------------------------

As the [design document|https://issues.apache.org/jira/browse/HDFS-347] for short-circuit reads
(SCR) says in the passage quoted below, disabling SCR is a simple way to handle its interaction
with operations such as append and truncate.
{quote}Interaction with other features (e.g. Append)
 We should investigate whether (and how) this feature will interact with other ongoing work,
in particular appends. If there is any complication, it should be straightforward to simply
disable the fast path for any blocks currently under construction. Given that the primary
benefit for the fast path is in mapreduce jobs, and mapreduce jobs rarely run on under-construction
blocks, this seems reasonable and avoids a lot of complexity.
{quote}
Disabling SCR is undoubtedly a useful way to simplify the design, and it lets the SCR feature
work well most of the time. But append can be a common case in some deployments, and SCR is
then disabled frequently (see details in HDFS-12528). That no longer matches the design's
original assumption that jobs *{{rarely run on under-construction blocks}}*, so the trade-off
seems less reasonable.


The initial patch is attached. My initial idea was to add an exception type to BlockOpResponseProto
to identify various kinds of exceptions, but since the main kind is FNFE (FileNotFoundException),
I just added a status to identify that specific case. I would be very grateful for opinions
from [~xiaochen] and [~cmccabe].
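
To make the idea concrete, here is a minimal sketch of how a failure could be mapped to a dedicated status before it is reported back to the client. It is not taken from the attached patch: the class name, the enum, and its two values are hypothetical and chosen only for illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical sketch: these names are illustrative and do not come from
// HDFS-13086.001.patch or from the HDFS code base.
final class ShortCircuitStatusSketch {
    /** Hypothetical statuses: a generic error and a dedicated one for FNFE. */
    enum Status { ERROR, ERROR_FILE_NOT_FOUND }

    private ShortCircuitStatusSketch() {
    }

    /** Maps an exception to the status that would be sent back to the client. */
    static Status classify(IOException e) {
        // FNFE is the case singled out in the comment; everything else
        // stays a generic error.
        return (e instanceof FileNotFoundException)
            ? Status.ERROR_FILE_NOT_FOUND
            : Status.ERROR;
    }
}
```

With a dedicated status, the client can tell the FNFE case apart without having to parse an error message.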

 



> Add a field about exception type to BlockOpResponseProto
> --------------------------------------------------------
>
>                 Key: HDFS-13086
>                 URL: https://issues.apache.org/jira/browse/HDFS-13086
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.1.0
>            Reporter: LiXin Ge
>            Assignee: LiXin Ge
>            Priority: Minor
>         Attachments: HDFS-13086.001.patch
>
>
> When a user re-reads a file through short-circuit reads, the read may hit unknown errors
because the file was appended after the first read, which changes its meta file, or because
the file was moved away by the balancer.
> Such unknown errors unnecessarily disable short-circuit reads for 10 minutes. HDFS-12528 makes
the {{expireAfterWrite}} of {{DomainSocketFactory$pathMap}} configurable, giving users the
option of never disabling the domain socket.
> We can go a step further and add a field about the exception type to BlockOpResponseProto,
so that we can ignore the acceptable FNFE and set an appropriate disable time for the
unacceptable exceptions when different types of exception happen in the same cluster.
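
The client-side policy described in the issue above could look roughly like the sketch below: an acceptable FNFE leaves the domain socket enabled, while any other failure disables it for a configurable time. This is an assumption-laden illustration, not the HDFS implementation. The class, the enum, and the socket path in the usage note are hypothetical, and a plain map with timestamps stands in for the expiring {{pathMap}} named in the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch: names and structure are illustrative only.
final class DomainSocketDisableSketch {
    /** Hypothetical failure kinds reported for a short-circuit read. */
    enum FailureKind { FILE_NOT_FOUND, OTHER }

    // Socket path -> time (ms) until which short-circuit reads stay disabled.
    private final Map<String, Long> disabledUntilMs = new HashMap<>();
    private final long disableMs;
    private final LongSupplier clockMs;

    DomainSocketDisableSketch(long disableMs, LongSupplier clockMs) {
        this.disableMs = disableMs;
        this.clockMs = clockMs;
    }

    /** Records a failure; only unacceptable failures disable the socket. */
    void onFailure(String socketPath, FailureKind kind) {
        if (kind == FailureKind.FILE_NOT_FOUND) {
            return; // acceptable: the file was appended to or moved
        }
        disabledUntilMs.put(socketPath, clockMs.getAsLong() + disableMs);
    }

    /** Returns true while the disable period for this socket has not expired. */
    boolean isDisabled(String socketPath) {
        Long until = disabledUntilMs.get(socketPath);
        return until != null && clockMs.getAsLong() < until;
    }
}
```

For example, {{new DomainSocketDisableSketch(600_000L, System::currentTimeMillis)}} would reproduce the 10-minute disable period for non-FNFE failures on a made-up path such as {{/var/run/dn_socket}}, while repeated FNFE results would never disable it.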



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

