hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14973) Log StorageStatistics
Date Tue, 24 Oct 2017 10:19:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216667#comment-16216667 ]

Steve Loughran commented on HADOOP-14973:
-----------------------------------------

Output from S3AInputStream.toString, again, logged in Scala after some Spark work:
{code}
- Duration of readFully(260, byte[256]) [pos = 45603306] = 341,500,815 nS
2017-10-10 16:17:54,090 [ScalaTest-main-running-S3ASeekReadSuite] INFO  s3.S3ASeekReadSuite
(Logging.scala:logInfo(54)) -   S3AInputStream{s3a://landsat-pds/scene_list.gz wrappedStream=open
read policy=sequential pos=516 nextReadPos=45603306 contentLength=45603307 contentRangeStart=260
contentRangeFinish=45603307 remainingInCurrentRequest=45602791
StreamStatistics{OpenOperations=6, CloseOperations=5, Closed=2, Aborted=3, SeekOperations=3,
ReadExceptions=0, ForwardSeekOperations=0, BackwardSeekOperations=3, BytesSkippedOnSeek=0,
BytesBackwardsOnSeek=91206303, BytesRead=815, BytesRead excluding skipped=815, ReadOperations=4,
ReadFullyOperations=4, ReadsIncomplete=0, BytesReadInClose=51, BytesDiscardedInAbort=136809661}}
{code}
That log shows abort operations; this test uses the committer.
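
For anyone wanting to reproduce that output, here's a rough sketch using the public FSDataInputStream API (paths as in the test above; a plain println stands in for the test's logger):
{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogStreamStats {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(URI.create("s3a://landsat-pds/"), new Configuration());
    try (FSDataInputStream in = fs.open(new Path("s3a://landsat-pds/scene_list.gz"))) {
      byte[] buf = new byte[256];
      in.readFully(260, buf);   // positioned read, as in the log above
      // The wrapped stream is the S3AInputStream; its toString() prints the
      // stream state and the StreamStatistics block shown above.
      System.out.println(in.getWrappedStream());
    }
  }
}
{code}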

And here are the stats collected from a localhost run of some ORC dataframe IO against an
S3 bucket in Spark.
{code}
INFO  s3.S3AOperations (Logging.scala:logInfo(54)) - Metrics:
  S3guard_metadatastore_put_path_latency50thPercentileLatency = 1285
  S3guard_metadatastore_put_path_latency75thPercentileLatency = 1332
  S3guard_metadatastore_put_path_latency90thPercentileLatency = 1332
  S3guard_metadatastore_put_path_latency95thPercentileLatency = 1332
  S3guard_metadatastore_put_path_latency99thPercentileLatency = 1332
  S3guard_metadatastore_put_path_latencyNumOps = 3
  S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rateNumEvents = 0
  committer_bytes_committed = 34833950
  committer_bytes_uploaded = 34833950
  committer_commits_aborted = 0
  committer_commits_completed = 45
  committer_commits_created = 43
  committer_commits_failed = 0
  committer_commits_reverted = 0
  committer_jobs_completed = 17
  committer_jobs_failed = 0
  committer_tasks_completed = 21
  committer_tasks_failed = 0
  directories_created = 31
  directories_deleted = 0
  fake_directories_deleted = 1127
  files_copied = 11
  files_copied_bytes = 128519807
  files_created = 46
  files_deleted = 34
  ignored_errors = 100
  object_continue_list_requests = 0
  object_copy_requests = 0
  object_delete_requests = 174
  object_list_requests = 514
  object_metadata_requests = 1081
  object_multipart_aborted = 0
  object_put_bytes = 163448561
  object_put_bytes_pending = 0
  object_put_requests = 142
  object_put_requests_active = 0
  object_put_requests_completed = 142
  op_copy_from_local_file = 0
  op_exists = 75
  op_get_file_status = 595
  op_glob_status = 12
  op_is_directory = 0
  op_is_file = 0
  op_list_files = 10
  op_list_located_status = 8
  op_list_status = 63
  op_mkdirs = 17
  op_rename = 11
  s3guard_metadatastore_initialization = 0
  s3guard_metadatastore_put_path_request = 380
  s3guard_metadatastore_retry = 0
  s3guard_metadatastore_throttled = 0
  store_io_throttled = 0
  stream_aborted = 0
  stream_backward_seek_operations = 20
  stream_bytes_backwards_on_seek = 23110
  stream_bytes_discarded_in_abort = 0
  stream_bytes_read = 68258611
  stream_bytes_read_in_close = 4417
  stream_bytes_skipped_on_seek = 0
  stream_close_operations = 95
  stream_closed = 95
  stream_forward_seek_operations = 0
  stream_opened = 95
  stream_read_exceptions = 0
  stream_read_fully_operations = 0
  stream_read_operations = 16774
  stream_read_operations_incomplete = 7828
  stream_seek_operations = 20
  stream_write_block_uploads = 18
  stream_write_block_uploads_aborted = 0
  stream_write_block_uploads_active = 0
  stream_write_block_uploads_committed = 0
  stream_write_block_uploads_data_pending = 0
  stream_write_block_uploads_pending = 42
  stream_write_failures = 0
  stream_write_total_data = 128624034
  stream_write_total_time = 354315
{code}
This run uses the committer code, so the committer and throttling statistics are tracked
too. Throttling is interesting because it isn't just per-thread: it applies across all
clients of a shard in a bucket. Collecting it on a per-query basis at least lets you see
that the reason something is slow is that the job was being throttled (so: fix that, tune
the backoff, or reduce the number of workers).
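
For reference, a minimal sketch of how such a dump can be produced from any FileSystem
instance via the StorageStatistics API (the bucket name is a placeholder, and exact
counter names vary with the S3A version):
{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.StorageStatistics;

public class DumpS3AStats {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(URI.create("s3a://some-bucket/"), new Configuration());
    StorageStatistics stats = fs.getStorageStatistics();
    // One counter per line, much like the listing above.
    stats.getLongStatistics().forEachRemaining(
        s -> System.out.printf("  %s = %d%n", s.getName(), s.getValue()));
    fs.close();
  }
}
{code}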


> Log StorageStatistics
> ---------------------
>
>                 Key: HADOOP-14973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14973
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>
> S3A is currently storing much more detailed metrics via StorageStatistics than are logged
> in a MapReduce job. Eventually, it would be nice to get Spark, MapReduce and other workloads
> to retrieve and store these metrics, but it may be some time before they all do that. I'd
> like to consider having S3A publish the metrics itself in some form. This is tricky, as S3A
> has no daemon but lives inside various other processes.
> Perhaps writing to a log file at some configurable interval and on close() would be the
> best we could do. Other ideas would be welcome.
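
To make the quoted proposal concrete, a hypothetical sketch of a daemon-free periodic
logger wrapped around a filesystem instance; the class name, config key and default
interval are invented for illustration, not actual S3A code:
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.StorageStatistics;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Hypothetical helper: logs a FileSystem's StorageStatistics periodically and on close(). */
class StatisticsLogger implements Closeable {
  private static final Logger LOG = LoggerFactory.getLogger(StatisticsLogger.class);
  private final FileSystem fs;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  StatisticsLogger(FileSystem fs, Configuration conf) {
    this.fs = fs;
    // Invented configuration key; a real patch would need an agreed name.
    long seconds = conf.getLong("fs.s3a.statistics.log.interval.seconds", 60);
    scheduler.scheduleAtFixedRate(this::logStatistics, seconds, seconds, TimeUnit.SECONDS);
  }

  private void logStatistics() {
    StorageStatistics stats = fs.getStorageStatistics();
    stats.getLongStatistics().forEachRemaining(
        s -> LOG.info("{} = {}", s.getName(), s.getValue()));
  }

  @Override
  public void close() throws IOException {
    logStatistics();          // final snapshot, the "on close()" part of the proposal
    scheduler.shutdownNow();
  }
}
{code}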


