hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15349) S3Guard DDB retryBackoff to be more informative on limits exceeded
Date Wed, 28 Mar 2018 11:53:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417229#comment-16417229 ]

Steve Loughran commented on HADOOP-15349:
-----------------------------------------

Metrics: the S3Guard retries are not being added to the stats. The bucket had a DDB table with
read = write = 10 capacity units when it overloaded.
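
(As an aside, a minimal sketch of how a table ends up with that capacity, assuming the standard fs.s3a.s3guard.ddb.table.capacity.* configuration keys were what was used for this table:)

{code}
// Sketch only: provisioning the S3Guard DDB table with read = write = 10
// capacity units via the standard S3A configuration keys. Assumption: these
// keys are how the table in this test was sized.
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
conf.setInt("fs.s3a.s3guard.ddb.table.capacity.read", 10);
conf.setInt("fs.s3a.s3guard.ddb.table.capacity.write", 10);
{code}

The full metrics from the run: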


{code}
2018-03-28 04:29:34,295 [ScalaTest-main-running-S3ACommitBulkDataSuite] INFO  s3.S3AOperations
(Logging.scala:logInfo(54)) - Metrics:
  S3guard_metadatastore_put_path_latency50thPercentileLatency = 0
  S3guard_metadatastore_put_path_latency75thPercentileLatency = 0
  S3guard_metadatastore_put_path_latency90thPercentileLatency = 0
  S3guard_metadatastore_put_path_latency95thPercentileLatency = 0
  S3guard_metadatastore_put_path_latency99thPercentileLatency = 0
  S3guard_metadatastore_put_path_latencyNumOps = 0
  S3guard_metadatastore_throttle_rate50thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate75thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate90thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate95thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rate99thPercentileFrequency (Hz) = 0
  S3guard_metadatastore_throttle_rateNumEvents = 0
  committer_bytes_committed = 12594213
  committer_bytes_uploaded = 12594213
  committer_commits_aborted = 0
  committer_commits_completed = 138
  committer_commits_created = 136
  committer_commits_failed = 0
  committer_commits_reverted = 0
  committer_jobs_completed = 29
  committer_jobs_failed = 0
  committer_magic_files_created = 2
  committer_tasks_completed = 34
  committer_tasks_failed = 0
  directories_created = 25
  directories_deleted = 0
  fake_directories_deleted = 1788
  files_copied = 6
  files_copied_bytes = 8473
  files_created = 39
  files_deleted = 28
  ignored_errors = 92
  object_continue_list_requests = 0
  object_copy_requests = 0
  object_delete_requests = 227
  object_list_requests = 431
  object_metadata_requests = 592
  object_multipart_aborted = 0
  object_put_bytes = 12738607
  object_put_bytes_pending = 0
  object_put_requests = 204
  object_put_requests_active = 0
  object_put_requests_completed = 204
  op_copy_from_local_file = 0
  op_create = 39
  op_create_non_recursive = 0
  op_delete = 77
  op_exists = 177
  op_get_file_checksum = 0
  op_get_file_status = 1232
  op_glob_status = 8
  op_is_directory = 22
  op_is_file = 0
  op_list_files = 13
  op_list_located_status = 6
  op_list_status = 110
  op_mkdirs = 10
  op_open = 407
  op_rename = 6
  s3guard_metadatastore_initialization = 1
  s3guard_metadatastore_put_path_request = 203
  s3guard_metadatastore_retry = 0
  s3guard_metadatastore_throttled = 0
  store_io_throttled = 0
  stream_aborted = 0
  stream_backward_seek_operations = 145
  stream_bytes_backwards_on_seek = 8272842
  stream_bytes_discarded_in_abort = 0
  stream_bytes_read = 9199577
  stream_bytes_read_in_close = 783876
  stream_bytes_skipped_on_seek = 187038
  stream_close_operations = 549
  stream_closed = 549
  stream_forward_seek_operations = 72
  stream_opened = 549
  stream_read_exceptions = 0
  stream_read_fully_operations = 448
  stream_read_operations = 13129
  stream_read_operations_incomplete = 932
  stream_seek_operations = 217
  stream_write_block_uploads = 2
  stream_write_block_uploads_aborted = 0
  stream_write_block_uploads_active = 0
  stream_write_block_uploads_committed = 0
  stream_write_block_uploads_data_pending = 0
  stream_write_block_uploads_pending = 37
  stream_write_failures = 0
  stream_write_total_data = 153642
  stream_write_total_time = 738
{code}

(PS: I suspect that uploads_pending is a false stat & it's really the uncommitted uploads
being counted.)
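
One way to eyeball these counters after a run is the sketch below; it assumes the S3Guard counters (s3guard_metadatastore_retry, s3guard_metadatastore_throttled, etc.) are wired into the FileSystem.getStorageStatistics() view, which is exactly where this run shows them stuck at zero. The bucket name is a placeholder.

{code}
import java.net.URI;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.StorageStatistics;

// Sketch: print every long-valued statistic the filesystem exposes, so the
// throttle/retry counters can be checked after a test run.
public class DumpS3AStats {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"),
        new Configuration());
    StorageStatistics stats = fs.getStorageStatistics();
    Iterator<StorageStatistics.LongStatistic> it = stats.getLongStatistics();
    while (it.hasNext()) {
      StorageStatistics.LongStatistic s = it.next();
      System.out.println(s.getName() + " = " + s.getValue());
    }
  }
}
{code}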

> S3Guard DDB retryBackoff to be more informative on limits exceeded
> ------------------------------------------------------------------
>
>                 Key: HADOOP-15349
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15349
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Priority: Major
>         Attachments: failure.log
>
>
> When S3Guard can't update the DB and so throws an IOE after the retry limit is exceeded,
> it's not at all informative. Improve logging & exception.
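
A sketch of the kind of failure message that would help, not the actual patch: when the retry limit is exceeded, the IOException names the table, the attempt count, the elapsed time and the likely fix. The putItem() and isThrottled() helpers are hypothetical stand-ins for the real DynamoDB calls.

{code}
import java.io.IOException;

// Illustrative retry/backoff wrapper with a self-describing failure.
// putItem() and isThrottled() are hypothetical stand-ins.
void putWithBackoff(String tableName, int maxRetries)
    throws IOException, InterruptedException {
  long start = System.currentTimeMillis();
  for (int attempt = 1; ; attempt++) {
    try {
      putItem();   // hypothetical DDB write
      return;
    } catch (RuntimeException e) {
      if (!isThrottled(e) || attempt >= maxRetries) {
        throw new IOException("S3Guard: retry limit (" + maxRetries
            + ") exceeded updating DynamoDB table " + tableName
            + " after " + attempt + " attempts over "
            + (System.currentTimeMillis() - start) + " ms;"
            + " the table's provisioned read/write capacity may be too low",
            e);
      }
      // exponential backoff, capped at ~6.4s
      Thread.sleep(100L << Math.min(attempt, 6));
    }
  }
}
{code}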




