hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-5776) Support 'hedged' reads in DFSClient
Date Wed, 19 Feb 2014 20:04:32 GMT

     [ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

stack updated HDFS-5776:

    Release Note: 
If a read from a block is slow, start up another parallel, 'hedged' read against a different
block replica.  We then take the result of whichever read returns first (the outstanding
read is cancelled).  This 'hedged' read feature helps rein in the outliers: the odd read
that takes a long time because, for example, it hit a bad patch on the disk.

This feature is off by default.  To enable it, set <code>dfs.client.hedged.read.threadpool.size</code>
to a positive number.  The threadpool size is the number of threads dedicated to running
these 'hedged', concurrent reads in your client.

Then set <code>dfs.client.hedged.read.threshold.millis</code> to the number of
milliseconds to wait before starting up a 'hedged' read.  For example, if you set this property
to 10, then if a read has not returned within 10 milliseconds, we will start up a new read
against a different block replica.
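
To make the threshold behaviour concrete, here is a minimal sketch of the hedging pattern in
plain Java.  It is not the DFSClient code: the class HedgedReadSketch, the method hedgedRead
and its parameters are hypothetical names invented for illustration, and the two Callables
merely stand in for reads against two different replicas.  Error handling is simplified; if
the first read to complete fails, this sketch rethrows instead of waiting for the other one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative only: not the DFSClient implementation. All names are hypothetical.
class HedgedReadSketch {

    // Runs `primary`; if it has not completed within thresholdMillis, also
    // starts `hedge` and returns the result of whichever completes first.
    static <T> T hedgedRead(ExecutorService pool,
                            Callable<T> primary,
                            Callable<T> hedge,
                            long thresholdMillis) throws Exception {
        CompletionService<T> reads = new ExecutorCompletionService<>(pool);
        Future<T> first = reads.submit(primary);

        // Wait up to the threshold for the original read to finish.
        Future<T> done = reads.poll(thresholdMillis, TimeUnit.MILLISECONDS);
        if (done != null) {
            return done.get();
        }

        // Too slow: start the hedged read and take whichever completes first.
        Future<T> second = reads.submit(hedge);
        Future<T> winner = reads.take();

        // Cancel the read still outstanding (a no-op for the one that finished).
        first.cancel(true);
        second.cancel(true);
        return winner.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            String result = hedgedRead(
                pool,
                () -> { Thread.sleep(1000); return "replica A"; }, // slow original read
                () -> "replica B",                                  // hedged read
                10);                                                // threshold in ms
            System.out.println("served by " + result);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this sketch the pool passed in plays the role of the hedged-read thread pool, and the last
argument plays the role of the threshold in milliseconds.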

This feature emits new metrics:

+ hedgedReadOps
+ hedgedReadOpsWin -- how many times the hedged read 'beat' the original read
+ hedgedReadOpsInCurThread -- how many times we went to do a hedged read but had to run
it in the current thread because all dfs.client.hedged.read.threadpool.size threads were already in use.
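
One way to get the "run it in the current thread" behaviour counted by the last metric is a
bounded thread pool with no queue and a caller-runs rejection policy.  The sketch below shows
that idea with java.util.concurrent only.  The class HedgedPoolSketch and the method newPool
are hypothetical names, and whether the patch wires its pool exactly this way is an assumption
here, not something stated in this release note.  poolSize must be at least 1.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: hypothetical names, not the DFSClient implementation.
class HedgedPoolSketch {

    // Counts tasks that had to run in the submitting thread.
    final AtomicLong hedgedReadOpsInCurThread = new AtomicLong();

    // Builds a pool of at most poolSize threads (poolSize >= 1) with no queue.
    // When every thread is busy, the task is rejected, counted, and then run
    // by the thread that tried to submit it.
    ThreadPoolExecutor newPool(int poolSize) {
        return new ThreadPoolExecutor(
            1, poolSize, 60L, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>(),
            new ThreadPoolExecutor.CallerRunsPolicy() {
                @Override
                public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
                    hedgedReadOpsInCurThread.incrementAndGet();
                    super.rejectedExecution(task, executor);
                }
            });
    }
}
```

With a pool like this, a rising counter suggests the pool is too small for the number of
hedged reads being attempted.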

> Support 'hedged' reads in DFSClient
> -----------------------------------
>                 Key: HDFS-5776
>                 URL: https://issues.apache.org/jira/browse/HDFS-5776
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-5776-v10.txt, HDFS-5776-v11.txt, HDFS-5776-v12.txt, HDFS-5776-v12.txt,
HDFS-5776-v13.wip.txt, HDFS-5776-v14.txt, HDFS-5776-v15.txt, HDFS-5776-v17.txt, HDFS-5776-v17.txt,
HDFS-5776-v2.txt, HDFS-5776-v3.txt, HDFS-5776-v4.txt, HDFS-5776-v5.txt, HDFS-5776-v6.txt,
HDFS-5776-v7.txt, HDFS-5776-v8.txt, HDFS-5776-v9.txt, HDFS-5776.txt, HDFS-5776v18.txt, HDFS-5776v21.txt
> This is a placeholder for the HDFS-related changes backported from https://issues.apache.org/jira/browse/HBASE-7509
> The quorum read ability should be especially helpful for optimizing read outliers.
> We can use "dfs.dfsclient.quorum.read.threshold.millis" and "dfs.dfsclient.quorum.read.threadpool.size"
to enable or disable the hedged read ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics
we could export the metric values of interest into the client system (e.g. HBase's regionserver metrics).
> The core logic is in the pread code path: we decide whether to go to the original fetchBlockByteRange
or the newly introduced fetchBlockByteRangeSpeculative according to the above config items.

This message was sent by Atlassian JIRA
