hadoop-common-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [hadoop] steveloughran opened a new pull request #2294: HADOOP-17195. ABFS Store thread pool for stream IO.
Date Wed, 09 Sep 2020 16:19:36 GMT

steveloughran opened a new pull request #2294:
URL: https://github.com/apache/hadoop/pull/2294

   This is the successor to #2179
   1. ABFS Store creates a single threadpool, configurable with fixed size or multiple of
   2. each output stream is given its own semaphored pool which limits the access that stream
has to the pool
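
A minimal sketch of the "semaphored pool" idea, using only `java.util.concurrent`. This is not the code in the patch: the class name `BoundedStreamExecutor` and its methods are hypothetical, chosen to show how one shared pool can be rationed per stream.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

/** Hypothetical per-stream view of a shared pool; not the patch's actual class. */
final class BoundedStreamExecutor {
  private final ExecutorService sharedPool; // owned by the store, shared by all streams
  private final Semaphore permits;          // caps this stream's queued + running tasks

  BoundedStreamExecutor(ExecutorService sharedPool, int maxActiveTasks) {
    this.sharedPool = sharedPool;
    this.permits = new Semaphore(maxActiveTasks);
  }

  /** Blocks the caller until this stream has a free permit, then submits the task. */
  <T> Future<T> submit(Callable<T> task) throws InterruptedException {
    permits.acquire();
    try {
      return sharedPool.submit(() -> {
        try {
          return task.call();
        } finally {
          permits.release(); // free the slot whether the task succeeded or failed
        }
      });
    } catch (RuntimeException e) {
      permits.release(); // submission itself was rejected, so the task will never run
      throw e;
    }
  }

  int availablePermits() {
    return permits.availablePermits();
  }
}
```

Each output stream would hold its own `BoundedStreamExecutor` wrapping the store's single pool, so one busy stream blocks in `submit()` instead of monopolising the shared threads.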
   To actually defend against OOMs, the per-stream queue length is what needs to be managed;
looking at the patch, it still has the problem of #2179: you need one buffer per pending upload
in the pools.
   Ultimately, the S3A Connector fixed this by going to disk buffering by default. A more performant
design might be to have a blocking byte buffer factory which limits the number of buffers that
the streams can request, putting an upper bound on the amount of memory which a single
ABFS store instance can demand.
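
One way such a blocking byte buffer factory could look is sketched below. It is an illustration only: the class `BlockingBufferFactory`, its method names, and the choice to preallocate every buffer up front are assumptions, not anything in the patch or in Hadoop.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical bounded buffer factory; names and sizes are illustrative. */
final class BlockingBufferFactory {
  // Holds the buffers that are currently free. Its capacity is the hard limit.
  private final BlockingQueue<ByteBuffer> free;

  BlockingBufferFactory(int maxBuffers, int bufferSize) {
    free = new ArrayBlockingQueue<>(maxBuffers);
    // Preallocate, so total memory never exceeds maxBuffers * bufferSize bytes.
    for (int i = 0; i < maxBuffers; i++) {
      free.add(ByteBuffer.allocate(bufferSize));
    }
  }

  /** Blocks the calling stream until a buffer is free. */
  ByteBuffer acquire() throws InterruptedException {
    return free.take();
  }

  /** Non-blocking variant: returns null when every buffer is in use. */
  ByteBuffer tryAcquire() {
    return free.poll();
  }

  /** Returns a buffer after its upload completes; throws if the pool is already full. */
  void release(ByteBuffer buffer) {
    buffer.clear();
    free.add(buffer);
  }

  int available() {
    return free.size();
  }
}
```

With one factory per store instance, a stream that wants to queue another upload has to wait in `acquire()` until an earlier upload has released its buffer, so back-pressure reaches the writer rather than the heap.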

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
