hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-15944) S3AInputStream logging to make it easier to debug file leakage
Date Mon, 19 Nov 2018 13:12:00 GMT
Steve Loughran created HADOOP-15944:

             Summary: S3AInputStream logging to make it easier to debug file leakage
                 Key: HADOOP-15944
                 URL: https://issues.apache.org/jira/browse/HADOOP-15944
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.1.1
            Reporter: Steve Loughran

Problem: if an app opens too many input streams, all the HTTP connections in the S3A
pool can be used up; all attempts to do other FS operations then fail, timing out while
waiting for a connection from the HTTP pool.
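
To make the failure mode concrete, here is a toy model of it. It is only a sketch: the
Semaphore stands in for the HTTP connection pool, and PoolExhaustionSketch, PooledStream
and open() are hypothetical names for illustration, not S3A or AWS SDK classes. Each open
stream holds one permit until it is closed, so streams that are never closed starve every
later caller.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool; not the real S3A or AWS SDK pool. */
public class PoolExhaustionSketch {

    /** Hypothetical stream that holds one pool permit until it is closed. */
    public static final class PooledStream implements AutoCloseable {
        private final Semaphore pool;
        private boolean closed;

        PooledStream(Semaphore pool) {
            this.pool = pool;
        }

        @Override
        public void close() {
            // Release the permit once only, so a double close is harmless.
            if (!closed) {
                closed = true;
                pool.release();
            }
        }
    }

    /** Returns a stream, or null if no connection became free within the timeout. */
    public static PooledStream open(Semaphore pool, long timeoutMillis) {
        try {
            return pool.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)
                    ? new PooledStream(pool)
                    : null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        Semaphore pool = new Semaphore(2);          // pool of two "connections"
        PooledStream leaked1 = open(pool, 10);      // opened and never closed
        PooledStream leaked2 = open(pool, 10);      // opened and never closed

        // The pool is now empty, so this open times out.
        System.out.println("third open succeeded: " + (open(pool, 10) != null));

        // Closing one stream frees a connection; try-with-resources closes the new one.
        leaked1.close();
        try (PooledStream s = open(pool, 10)) {
            System.out.println("open after close succeeded: " + (s != null));
        }
    }
}
```

In this model the third open fails only because the first two streams were never closed,
which is the pattern the proposed logging is meant to make visible.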

Proposed simple solution: log better what's going on with the input stream lifecycle, specifically:

# include the URL of the file in open, reopen & close events
# maybe: a separate logger for these events, though the S3AInputStream logger should be enough
as it doesn't do much else.
# maybe: have some prefix in the events like "Lifecycle", so that you could use the existing
log at debug, grep for that phrase and look at the printed URLs to identify what's going on
# stream metrics: expose some of the state of the HTTP connection pool and/or active input
and output streams
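
A minimal sketch of what such events could look like, assuming a "Lifecycle" prefix and
a simple active-stream counter as the exposed metric. The class and method names are
hypothetical and are not part of S3AInputStream. java.util.logging at FINE is used only
to keep the sketch self-contained, standing in for the debug-level logging mentioned
above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper showing greppable lifecycle events; not S3A code. */
public class StreamLifecycleLogSketch {
    private static final Logger LOG = Logger.getLogger("StreamLifecycleLogSketch");

    /** Assumed prefix, so that "grep Lifecycle" finds every event. */
    public static final String PREFIX = "Lifecycle";

    /** Number of streams opened and not yet closed. */
    private static final AtomicInteger ACTIVE = new AtomicInteger();

    /** Builds one event line containing the prefix, the event name and the file URL. */
    public static String format(String event, String uri, int active) {
        return PREFIX + ": " + event + " " + uri + " (active streams=" + active + ")";
    }

    public static String opened(String uri) {
        return log(format("open", uri, ACTIVE.incrementAndGet()));
    }

    public static String reopened(String uri) {
        // A reopen does not change the number of active streams.
        return log(format("reopen", uri, ACTIVE.get()));
    }

    public static String closed(String uri) {
        return log(format("close", uri, ACTIVE.decrementAndGet()));
    }

    public static int activeStreams() {
        return ACTIVE.get();
    }

    private static String log(String message) {
        LOG.log(Level.FINE, message);   // FINE plays the role of debug here
        return message;
    }

    public static void main(String[] args) {
        String uri = "s3a://example-bucket/data.csv";   // made-up path
        System.out.println(opened(uri));
        System.out.println(reopened(uri));
        System.out.println(closed(uri));
    }
}
```

With events in this shape, a URL that shows up in open events without a matching close
event points at the stream that is being leaked.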

Idle output streams don't use up http connections, as they only connect during block upload.

