hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients
Date Mon, 21 Aug 2017 16:12:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135370#comment-16135370 ]

Jason Lowe commented on HADOOP-13139:

A user ran into this on one of our clusters that was upgraded to 2.8.  They were running
a pre-2.8 version of the S3AFileSystem code with their job, and it failed like this:
	at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1307)
	at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1230)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:280)
	at com.yahoo.prism.UseLocalKeyS3AFileSystem.initializeFileSystem(UseLocalKeyS3AFileSystem.java:68)
	at com.yahoo.prism.UseLocalKeyS3AFileSystem.initialize(UseLocalKeyS3AFileSystem.java:113)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2670)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:95)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2704)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2686)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)

The problem is that core-default in 2.8 removed fs.s3a.threads.core and changed the existing
fs.s3a.threads.max to 10.  The pre-2.8 S3AFileSystem code had code defaults of 15 (core) and
256 (max).  So when a 2.8 job client (in this case an Oozie server) submits the job, job.xml
picks up the 2.8 core-default setting for fs.s3a.threads.max, but the job itself runs with the
older S3AFileSystem code.  That code finds no fs.s3a.threads.core setting, falls back to its
code default of 15, and the job fails because it tries to initialize a thread pool with
core threads=15 and max threads=10.
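
For reference, the failure at the top of the trace can be reproduced outside Hadoop. This is only a minimal sketch: the class name, keep-alive time and queue type are my own picks and not what S3AFileSystem uses. It relies on the documented behavior that the java.util.concurrent.ThreadPoolExecutor constructor throws IllegalArgumentException when the maximum pool size is smaller than the core pool size.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolBoundsDemo {

    /** Returns true if a ThreadPoolExecutor accepts these bounds. */
    static boolean canCreatePool(int coreThreads, int maxThreads) {
        try {
            // Keep-alive and queue are arbitrary illustration values.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
            pool.shutdown();
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when maxThreads < coreThreads.
            return false;
        }
    }

    public static void main(String[] args) {
        // Old code defaults: core=15, max=256.
        System.out.println("core=15 max=256 ok? " + canCreatePool(15, 256));
        // Mixed settings from this report: core=15 (old code default), max=10.
        System.out.println("core=15 max=10  ok? " + canCreatePool(15, 10));
    }
}
```

The first combination is accepted and the second is rejected, which matches the two ThreadPoolExecutor.<init> frames in the trace.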

Not sure if this is considered simply an invalid setup, but I suspect this won't be the first
case of someone submitting a job with a 2.8 or later client (e.g.: via an Oozie server upgraded
independently of a user's job code) and failing because the user hasn't upgraded to the 2.8
or later S3AFileSystem code yet.

If we had added a deprecated core-default value for fs.s3a.threads.core then the older code
would have gotten consistent values for core and max threads.  As it is now, it gets half
of the new default settings, and those aren't compatible with the older, other half of the
defaults.  Thoughts on whether this is worth doing in a followup JIRA?
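
To make the "half of the new defaults" point concrete, here is a rough sketch of how the old code ends up with the mismatched pair. It uses a plain Map in place of the Hadoop Configuration, the helper names are hypothetical, and the value 10 for a re-added fs.s3a.threads.core is only an illustration (any value no larger than fs.s3a.threads.max would do); none of this is the actual S3AFileSystem code.

```java
import java.util.HashMap;
import java.util.Map;

public class MixedDefaultsDemo {
    // Code defaults of the pre-2.8 S3AFileSystem, as described above.
    static final int OLD_DEFAULT_CORE = 15;
    static final int OLD_DEFAULT_MAX = 256;

    /** Hypothetical stand-in for an int lookup with a fallback default. */
    static int getInt(Map<String, String> conf, String key, int fallback) {
        String value = conf.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    /** Resolves {core, max} the way pre-2.8 code would: config first, else code default. */
    static int[] resolveOldStyle(Map<String, String> jobConf) {
        int core = getInt(jobConf, "fs.s3a.threads.core", OLD_DEFAULT_CORE);
        int max = getInt(jobConf, "fs.s3a.threads.max", OLD_DEFAULT_MAX);
        return new int[] {core, max};
    }

    public static void main(String[] args) {
        // job.xml written by a 2.8 client: only the max key is present.
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("fs.s3a.threads.max", "10");
        int[] mixed = resolveOldStyle(jobConf);
        System.out.println("as shipped:   core=" + mixed[0] + " max=" + mixed[1]);

        // Same job.xml if core-default still carried a (deprecated) core key.
        jobConf.put("fs.s3a.threads.core", "10"); // illustrative value only
        int[] consistent = resolveOldStyle(jobConf);
        System.out.println("with core key: core=" + consistent[0] + " max=" + consistent[1]);
    }
}
```

With only the max key present, the old code resolves core=15 and max=10; with both keys present, it gets a pair the thread pool constructor accepts.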

> Branch-2: S3a to use thread pool that blocks clients
> ----------------------------------------------------
>                 Key: HADOOP-13139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13139
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Pieter Reuse
>            Assignee: Pieter Reuse
>             Fix For: 2.8.0
>         Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, HADOOP-13139-branch-2.002.patch,
HADOOP-13139-branch-2-003.patch, HADOOP-13139-branch-2-004.patch, HADOOP-13139-branch-2-005.patch,
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will attach a
> patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 'fs.s3a.threads.core'
> has been removed and the behavior of the ThreadPool for s3a has been changed.
