camel-issues mailing list archives

From "MykhailoVlakh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CAMEL-11697) S3 Consumer: If maxMessagesPerPoll is greater than 50 consumer fails to poll objects from bucket
Date Wed, 23 Aug 2017 14:00:03 GMT

    [ https://issues.apache.org/jira/browse/CAMEL-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138386#comment-16138386 ]

MykhailoVlakh commented on CAMEL-11697:
---------------------------------------

[~ancosen] I suggested adding a few more connections to the pool to make sure that the
S3 consumer is still able to perform other S3 API calls while it already holds all
maxMessagesPerPoll S3 objects open. Otherwise any additional API call will fail, since the
pool will be empty at that point. If no additional API calls are expected, then it is
perfectly fine to have exactly maxMessagesPerPoll connections in the pool.
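
For completeness, here is a sketch of a user-side workaround that is already possible today:
build a client whose pool has that headroom and hand it to the endpoint from the registry.
The bean name {{myS3Client}}, the credentials and the bucket name below are placeholders, not
anything from the actual setup:

{code}
import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3Client;

int maxMessagesPerPoll = 100;

ClientConfiguration s3Config = new ClientConfiguration();
// keep some headroom beyond maxMessagesPerPoll so that listObjects/deleteObject calls
// still get a connection while all the polled object streams are open
s3Config.setMaxConnections(maxMessagesPerPoll + 20);

AmazonS3Client client = new AmazonS3Client(
        new BasicAWSCredentials("accessKey", "secretKey"), s3Config);

// bind "client" in the registry as "myS3Client" and reference it from the endpoint, e.g.
// from("aws-s3://my-bucket?amazonS3Client=#myS3Client&maxMessagesPerPoll=100").to("...");
{code}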

Yes, I saw your suggestion to contribute. If I find some free time I will take another look
at the code and try to prepare a fix as a patch. But it would be nice to understand whether
there is a real reason to open all the objects before sending the exchanges for processing,
because if that was intentional there is no point in trying to change this code without
knowing about it.
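
Just to illustrate what I mean by "opening all the objects before sending exchanges" (this is
only my paraphrase of the pattern suggested by the stack trace, not the actual S3Consumer
source; the method name and parameters are invented for the example):

{code}
// Each getObject() pins one pooled HTTP connection until its stream is closed, so if the
// listing contains more keys than the pool has connections, the later calls time out with
// "Timeout waiting for connection from pool".
List<S3Object> openAllObjectsUpFront(AmazonS3 s3, String bucket, ObjectListing listing) {
    List<S3Object> open = new ArrayList<>();
    for (S3ObjectSummary summary : listing.getObjectSummaries()) {
        open.add(s3.getObject(bucket, summary.getKey())); // stream opened, connection held
    }
    return open; // every stream is already open before the first one is processed
}
{code}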

Thank you!

> S3 Consumer: If maxMessagesPerPoll is greater than 50 consumer fails to poll objects from bucket
> ------------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-11697
>                 URL: https://issues.apache.org/jira/browse/CAMEL-11697
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-aws
>    Affects Versions: 2.14.3, 2.19.2
>            Reporter: MykhailoVlakh
>            Assignee: Andrea Cosentino
>             Fix For: 2.20.0
>
>
> It is possible to configure the S3 consumer to process several S3 objects in a single poll using the maxMessagesPerPoll property.
> If this property is set to a small number (less than 50) everything works fine, but if a user tries to consume more files, the S3 consumer simply fails every time. It cannot poll the files because there are not enough HTTP connections to open streams for all the requested files at once. The exception looks like this:
> {code}
> com.amazonaws.AmazonClientException: Unable to execute HTTP request: Timeout waiting
for connection from pool
> 	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:544)
> 	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:273)
> 	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3660)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1133)
> 	at com.amazonaws.services.s3.AmazonS3EncryptionClient.access$201(AmazonS3EncryptionClient.java:65)
> 	at com.amazonaws.services.s3.AmazonS3EncryptionClient$S3DirectImpl.getObject(AmazonS3EncryptionClient.java:524)
> 	at com.amazonaws.services.s3.internal.crypto.S3CryptoModuleAE.getObjectSecurely(S3CryptoModuleAE.java:106)
> 	at com.amazonaws.services.s3.internal.crypto.CryptoModuleDispatcher.getObjectSecurely(CryptoModuleDispatcher.java:114)
> 	at com.amazonaws.services.s3.AmazonS3EncryptionClient.getObject(AmazonS3EncryptionClient.java:427)
> 	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1005)
> 	at org.apache.camel.component.aws.s3.S3Consumer.createExchanges(S3Consumer.java:112)
> 	at org.apache.camel.component.aws.s3.S3Consumer.poll(S3Consumer.java:93)
> 	at org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:187)
> 	at org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:114)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> The issue happens because, by default, AmazonS3Client uses an HTTP client with a limited number of connections in the pool: 50.
> Since the S3 consumer makes it possible to consume any number of S3 objects in a single poll, and since it is quite a common case that someone needs to process 50 or more files in one poll, I think the S3 consumer should handle this case properly. It should automatically increase the HTTP connection pool size so that it can serve the requested number of objects. This can be done like this:
> {code}
> ClientConfiguration s3Config = new ClientConfiguration();
> // +20: allocate a bit more so that we can always make additional API calls
> // while we already hold maxMessagesPerPoll S3 object streams open
> s3Config.setMaxConnections(maxMessagesPerPoll + 20);
> AmazonS3Client client = new AmazonS3Client(awsCreds, s3Config);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
