hadoop-hdfs-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8088) Reduce the number of HTrace spans generated by HDFS reads
Date Tue, 14 Apr 2015 20:11:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14494783#comment-14494783 ]

Josh Elser commented on HDFS-8088:

bq. Keep in mind that doing a write in HDFS just hands the data off to a background thread
called DataStreamer, which writes it out asynchronously

Ahh, good point. I didn't fully connect the dots in my head when I was initially guessing.

bq. I'm inclined to lean more towards goal #1 (figure out why specific requests had high latency)
than goal #2


bq.  I do think maybe we should target 2.7.1 for some of these changes since I need to think
through everything

I would be very happy to see this (as well as HDFS-8026) land into 2.7.1.

bq.  I'd also like to run some patches by you guys to see if it improves the usefulness of
HTrace to you.

Happy to do so. I'm sure [~billie.rinaldi], as well as some other Accumulo folks, would be

bq. also I am at a conference now, so I apologize if my replies are slow!

No worries! Assuming you're at ApacheCon (and presumably speaking?), I hope it goes well.
Enjoy, and we can catch up when you're on a normal schedule again.

> Reduce the number of HTrace spans generated by HDFS reads
> ---------------------------------------------------------
>                 Key: HDFS-8088
>                 URL: https://issues.apache.org/jira/browse/HDFS-8088
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-8088.001.patch
> HDFS generates too many trace spans on read right now.  Every call to read() we make
> generates its own span, which is not very practical for things like HBase or Accumulo that
> do many such reads as part of a single operation.  Instead of tracing every call to read(),
> we should only trace the cases where we refill the buffer inside a BlockReader.
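
To make the proposal in the description concrete, here is a toy sketch of "trace the refill, not the read". It is not HDFS or HTrace code: the class RefillTracingReader, its fields, and the string "spans" it records are all hypothetical stand-ins, and the real BlockReader implementations and the attached patch may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model only (hypothetical names, not HDFS code): a buffered reader
 * that records a "span" when it refills its buffer, not on every read().
 */
class RefillTracingReader {
    private final byte[] source;   // stands in for the block's data
    private final byte[] buffer;   // stands in for the reader's internal buffer
    private int sourcePos = 0;     // next unread offset in source
    private int bufPos = 0;        // next unread offset in buffer
    private int bufLimit = 0;      // number of valid bytes currently in buffer
    private final List<String> spans = new ArrayList<>(); // recorded "spans"

    RefillTracingReader(byte[] source, int bufferSize) {
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    /** Returns the next byte (0-255), or -1 at end of data. No span is recorded here. */
    int read() {
        if (bufPos == bufLimit && !refill()) {
            return -1;
        }
        return buffer[bufPos++] & 0xff;
    }

    /** Copies the next chunk into the buffer; this is the only traced operation. */
    private boolean refill() {
        int n = Math.min(buffer.length, source.length - sourcePos);
        if (n <= 0) {
            return false; // nothing left to fetch, so nothing to trace
        }
        // Placeholder for opening and closing a real trace span.
        spans.add("refill@" + sourcePos);
        System.arraycopy(source, sourcePos, buffer, 0, n);
        sourcePos += n;
        bufPos = 0;
        bufLimit = n;
        return true;
    }

    /** Spans recorded so far, in order. */
    List<String> spans() {
        return spans;
    }
}
```

In this model, reading 10 bytes one at a time through a 4-byte buffer records 3 spans (one per refill) instead of the 10 that per-call tracing would produce, which is the kind of reduction the description is after for callers such as HBase or Accumulo.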

This message was sent by Atlassian JIRA
