hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13028) add counter and timer metrics for S3A HTTP & low-level operations
Date Fri, 22 Apr 2016 18:29:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254401#comment-15254401 ]

Steve Loughran commented on HADOOP-13028:

Colin, I'm about to push up my patch.

# Nobody had told me about HDFS-10175; never mind.
# I'm using the classic MetricsRegistry, with all the instrumentation lifted from Azure. I've made
the text/keys more generic, so the counters can be used for other object stores.
# Added a metrics-to-string builder, so the S3AFileSystem.toString() operation can just do
a complete dump of the stats. This is handy as it lets me print out the statistics of a run
even with code built against older Hadoop versions.
# Note that in the object stores, it's not so much "per FS method" we're counting as "per
object store API method". E.g. we're counting the number of copy operations in a rename, the
number of bytes copied remotely, the deletes that take place there, etc.
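
To make the "per object store API method" point concrete, here is a minimal sketch, not the code in the patch. The class ObjectStoreCounters and its methods are hypothetical; only the counter names (files_copied, files_copied_bytes, files_deleted) are taken from the dump further down. It assumes a store with no native rename, so one rename call becomes a remote copy plus a delete per object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a store's instrumentation; not the S3A classes. */
class ObjectStoreCounters {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Add delta to the named counter, creating the counter on first use. */
    void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    /** Current value of the named counter; 0 if it was never incremented. */
    long get(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0 : counter.get();
    }

    /**
     * Record a simulated rename: every object under the source path is
     * copied remotely and then deleted, so one FS-level call turns into
     * several object store operations, each counted separately.
     */
    void recordRename(long[] objectSizes) {
        for (long size : objectSizes) {
            increment("files_copied", 1);
            increment("files_copied_bytes", size);
            increment("files_deleted", 1);
        }
    }
}
```

Renaming a directory holding two objects of 100 and 250 bytes would then show up as two copies, 350 bytes copied and two deletes, rather than as a single rename.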

Because this code is the usual metrics stuff, it slots quite nicely into what is already
there. It does add one class to Hadoop common, MetricStringBuilder, which I've put there for
its generic usability.
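
As a rough illustration of the metrics-to-string idea, here is a standalone sketch. SimpleMetricStringBuilder is a hypothetical class written for this example; it is not the MetricStringBuilder from the patch and does not use the Hadoop metrics interfaces. It only reproduces the "{name=value} " layout seen in the toString() dump.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the MetricStringBuilder added by the patch. */
class SimpleMetricStringBuilder {
    // LinkedHashMap keeps entries in insertion order, so the dump is stable.
    private final Map<String, Object> entries = new LinkedHashMap<>();

    /** Record one tag, counter or gauge value; returns this for chaining. */
    SimpleMetricStringBuilder add(String name, Object value) {
        entries.put(name, value);
        return this;
    }

    /** Render every entry as "{name=value} " inside a "metrics {...}" wrapper. */
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("metrics {");
        for (Map.Entry<String, Object> e : entries.entrySet()) {
            sb.append('{').append(e.getKey()).append('=')
              .append(e.getValue()).append("} ");
        }
        return sb.append('}').toString();
    }
}
```

Because the result is an ordinary string, a caller only needs toString() on the filesystem to print the statistics, which is what makes the dump usable from code built against older Hadoop versions.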

S3AFileSystem{uri=s3a://landsat-pds, workingDir=s3a://landsat-pds/user/stevel, partSize=104857600,
enableMultiObjectsDelete=true, multiPartThreshold=2147483647, serverSideEncryptionAlgorithm='null',
statistics {3843 bytes read, 0 bytes written, 2 read ops, 0 large read ops, 0 write ops},
metrics {{Context=S3AFileSystem} {FileSystemId=9042fe44-6438-4cc5-b3bf-d594dc71e699} {streamOpened=7}
{streamCloseOperations=6} {streamClosed=1} {streamAborted=5} {streamSeekOperations=5} {readExceptions=0}
{forwardSeekOperations=3} {backwardSeekOperations=2} {bytesSkippedOnSeek=767} {files_created=0}
{files_copied=0} {files_copied_bytes=0} {files_deleted=0} {directories_created=0} {directories_deleted=0}
{ignored_errors=0} }}

> add counter and timer metrics for S3A HTTP & low-level operations
> -----------------------------------------------------------------
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
> Against S3 (and other object stores), opening connections can be expensive, closing connections
> may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of open/close/failure+reconnect
> operations, and timers of how long things take. This can be used downstream to measure efficiency
> of the code (how often connections are being made), connection reliability, etc.
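
The counters and timers described above could be sketched along these lines. StreamStatistics and all of its members are hypothetical names chosen for this example, not the S3A implementation, and timing with System.nanoTime() is an assumption about how one might measure the open call.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical per-stream statistics; not the S3A implementation. */
class StreamStatistics {
    final AtomicLong openOperations = new AtomicLong();
    final AtomicLong closeOperations = new AtomicLong();
    final AtomicLong failures = new AtomicLong();
    final AtomicLong openNanos = new AtomicLong();

    /**
     * Run the supplied open action, counting the attempt and the time it
     * took. A failed attempt is counted as a failure and rethrown so the
     * caller can decide whether to reconnect.
     */
    <T> T timedOpen(Supplier<T> opener) {
        long start = System.nanoTime();
        try {
            return opener.get();
        } catch (RuntimeException e) {
            failures.incrementAndGet();
            throw e;
        } finally {
            openOperations.incrementAndGet();
            openNanos.addAndGet(System.nanoTime() - start);
        }
    }

    /** Count one close of the underlying connection. */
    void streamClosed() {
        closeOperations.incrementAndGet();
    }
}
```

Comparing openOperations with the number of read calls gives a downstream view of how often connections are being made, and failures relative to openOperations gives a crude reliability figure.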
