hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5776) Support 'hedged' reads in DFSClient
Date Sat, 18 Jan 2014 00:57:19 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875453#comment-13875453
] 

stack commented on HDFS-5776:
-----------------------------

This looks like copy paste issue:

+      // assert that there were no quorum reads. 60ms + delta < 100ms
+      assertTrue(metrics.getParallelReadOps() > 0);

Here you are asserting that there IS // 'hedged' reads going on.

I was wondering if 'hedged requests' a good name for this feature and going by the definition
from your citation, it is good by me:

{quote}
Hedged requests. A simple way to curb latency variability is to issue the same request to
multiple replicas and use the results from whichever replica responds first. We term such
requests "hedged requests" because a client first sends one request to the replica believed
to be the most appropriate, but then falls back on sending a secondary request after some
brief delay. The client cancels remaining outstanding requests once the first result is received.
Although naive implementations of this technique typically add unacceptable additional load,
many variations exist that give most of the latency-reduction effects while increasing load
only modestly.

One such approach is to defer sending a secondary request until the first request has been
outstanding for more than the 95th-percentile expected latency for this class of requests.
This approach limits the additional load to approximately 5% while substantially shortening
the latency tail. ....
{quote}
http://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext

We'd change this to be allowHedgeReads?

+  public volatile boolean allowParallelReads = false;

This would be hedgedReadThresholdMillis

+  private volatile long quorumReadThresholdMillis;

... and so on.

Or do you see parallel reads as different from hedged reads?  ('Tied requests' from your citation).

These are public so clients like hbase can tinker with them:

{code}
+
+  public void setQuorumReadTimeout(long timeoutMillis) {
+    this.quorumReadThresholdMillis = timeoutMillis;
+  }
+
+  public long getQuorumReadTimeout() {
+    return this.quorumReadThresholdMillis;
+  }
+
+  public void enableParallelReads() {
+    allowParallelReads = true;
+  }
+
+  public void disableParallelReads() {
+    allowParallelReads = false;
+  }
{code}

So if the requests are > the configured number, we run in the current thread which is better
than rejecting the request... What happens when we go beyond this allowance?  We throw the
rejected exception?  When would there be more than the configured number of // reads going
on?

Should these be public?

+  public ThreadPoolExecutor getParallelReadsThreadPool() {
+    return parallelReadsThreadPool;
+  }
+
+  public DFSQuorumReadMetrics getQuorumReadMetrics() {


Will have to rename this class? DFSQuorumReadMetrics

Add class comment on +public class DFSQuorumReadMetrics { saying what the metric means (since
there is no description for the metrics it seems).

Need a space in here?  "+public class DFSQuorumReadMetrics {"

I need to spend more time on the DFSClient changes but above should do for a first cut at
a review.   Thanks [~xieliang007]

> Support 'hedged' reads in DFSClient
> -----------------------------------
>
>                 Key: HDFS-5776
>                 URL: https://issues.apache.org/jira/browse/HDFS-5776
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-5776-v2.txt, HDFS-5776.txt
>
>
> This is a placeholder of hdfs related stuff backport from https://issues.apache.org/jira/browse/HBASE-7509
> The quorum read ability should be helpful especially to optimize read outliers
> we can utilize "dfs.dfsclient.quorum.read.threshold.millis" & "dfs.dfsclient.quorum.read.threadpool.size"
to enable/disable the hedged read ability from client side(e.g. HBase), and by using DFSQuorumReadMetrics,
we could export the interested metric valus into client system(e.g. HBase's regionserver metric).
> The core logic is in pread code path, we decide to goto the original fetchBlockByteRange
or the new introduced fetchBlockByteRangeSpeculative per the above config items.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message