hadoop-hdfs-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-6110) adding more slow action log in critical write path
Date Fri, 25 Apr 2014 20:08:23 GMT

     [ https://issues.apache.org/jira/browse/HDFS-6110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HDFS-6110:
------------------------

    Attachment: HDFS-6110v6.txt

This is [~xieliang007]'s latest patch with offline review feedback I got from Todd added in
(see below): one slow-i/o warning threshold for the dfsclient (a higher one, so folks running
MR jobs don't get annoyed by all the WARNings about slow i/o), and a separate, much lower
threshold on the datanode side so we can see bad i/os.

{code}
16:38 < todd> stack: just looked at 6110. had one more thought after commenting on the JIRA
16:38 < todd> you think we should add a separate config for client vs server?
16:38 < todd> I'm afraid that the 300ms default may be a little aggressive for the client - people using hadoop fs -put to upload files may get kind of nervous the next time they upgrade if they start seeing warnings
16:38 < todd> MR jobs too
16:39 < todd> may be better to have the client default be 10sec or something really long, and then HBase could tune it down for WAL files
16:39 < stack> todd: thanks boss
16:39 < todd> you think i'm crazy?
16:39 < stack> no
16:39 < stack> Testing it, it is "illuminating" to see how long stuff takes
16:39 < todd> k. yea
16:39 < todd> I had a patch like that once on the server side
16:39 < stack> Was worried though that it'd freak folks out.
16:40 < stack> Or, rather, they'd ignore what is being said and just consider it 'noise'.
16:40 < todd> yea
16:40 < todd> for a throughput app it is kind of noise
16:40 < todd> but hbase could definitely tune the default inside the RS down
16:40 < stack> Let me do as you suggest.
16:40 < todd> k
16:40 < stack> Thanks for review.
16:40 < todd> feel free to paste this convo into the jira so it makes sense :)
16:40 < todd> didn't want to post yet another comment and pollute everyone's mailboxes
16:41  * stack nod
{code}
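
To make the two-threshold idea from the chat concrete, here is a minimal, self-contained sketch. It is not the code in the attached patch: the class name, method name, and constants are hypothetical, and the values are only the ones floated in the conversation (300ms on the datanode side, something like 10sec on the client side).

```java
// Hypothetical illustration only; not the HDFS-6110 patch and not a Hadoop API.
final class SlowIoWarning {
    // Assumed values taken from the chat above, not from any released config.
    static final long DATANODE_THRESHOLD_MS = 300;    // low, so bad i/os show up
    static final long CLIENT_THRESHOLD_MS = 10_000;   // high, so throughput apps are not spammed

    private SlowIoWarning() {}

    /**
     * Returns a warning message when elapsedMs exceeds thresholdMs,
     * or null when the operation was fast enough to stay silent.
     */
    static String checkSlow(String side, String action, long elapsedMs, long thresholdMs) {
        if (elapsedMs > thresholdMs) {
            return String.format("Slow %s on %s took %dms (threshold=%dms)",
                    action, side, elapsedMs, thresholdMs);
        }
        return null;
    }

    public static void main(String[] args) {
        long elapsedMs = 450; // pretend a flush took 450ms

        // The same latency is worth a warning on the datanode ...
        String dn = checkSlow("datanode", "flush", elapsedMs, DATANODE_THRESHOLD_MS);
        // ... but stays quiet on a client using the long default.
        String client = checkSlow("client", "flush", elapsedMs, CLIENT_THRESHOLD_MS);

        if (dn != null) {
            System.out.println("WARN " + dn);
        }
        if (client != null) {
            System.out.println("WARN " + client);
        }
    }
}
```

A latency-sensitive caller such as an HBase regionserver could pass its own, lower threshold instead of `CLIENT_THRESHOLD_MS`, which is the "tune it down for WAL files" case Todd describes.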

> adding more slow action log in critical write path
> --------------------------------------------------
>
>                 Key: HDFS-6110
>                 URL: https://issues.apache.org/jira/browse/HDFS-6110
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6110-v2.txt, HDFS-6110.txt, HDFS-6110v3.txt, HDFS-6110v4.txt, HDFS-6110v5.txt, HDFS-6110v6.txt
>
>
> After digging into an HBase write spike issue caused by slow buffer io in our cluster, we realized we had better add more abnormal-latency warning logs in the write flow, so that if others hit an HLog sync spike, we can get more detailed info from the HDFS side at the same time.
> Patch will be uploaded soon.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
