camel-users mailing list archives

From David Hoffer <dhoff...@gmail.com>
Subject Re: Camel ThreadPool maxQueueSize question
Date Sun, 22 Nov 2015 02:29:35 GMT
I'm not sure how to block the polling.

Here is what seems like an ideal approach...the SFTP polling always runs on
schedule and downloads files with a single thread to a folder.  This won't
use much memory as it's just copying one file at a time to the folder.  Then
I'd have X threads take those files and start the decrypting/processing.
Since this part uses a lot of memory, it seems I'd want to limit the number
of threads that can do this task so the max memory is contained.
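
In plain JDK terms, the idea of letting only X threads do the memory-heavy
step can be sketched with a fixed-size pool. This is a minimal illustration,
not Camel configuration: the class name BoundedProcessing, the method
maxConcurrent, and the short sleep that stands in for decrypting one file are
all assumptions made up for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedProcessing {

    // Runs `files` simulated tasks on at most `workers` threads and returns
    // the highest number of tasks observed running at the same time.
    static int maxConcurrent(int workers, int files) {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < files; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the high-water mark
                try {
                    Thread.sleep(5); // hypothetical stand-in for decrypting one file
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrent(4, 50));
    }
}
```

Because the pool never runs more than `workers` tasks at once, peak memory for
this step is roughly bounded by the number of workers times the per-file
footprint, as long as waiting tasks do not hold file content themselves.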

However I don't know how to do this as I'm new to Camel.

Yes, I'd really like to use streaming instead of byte[] at every step of the
processing, but I have no idea if that's possible in my use case.  Sounds
like it worked in yours.
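
As a rough picture of what streaming instead of byte[] means at the JDK level,
the sketch below pushes data through a fixed 8 KB buffer so memory use does
not grow with file size. The class StreamingCopy, the method transform, and
the XOR step that stands in for real decryption are assumptions for
illustration only; whether the actual decryption library can work on streams
is not something this thread establishes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class StreamingCopy {

    // Reads `in` chunk by chunk, applies a per-byte XOR with `key` (a
    // placeholder for real decryption), and writes the result to `out`.
    // Returns the number of bytes processed.
    static long transform(InputStream in, OutputStream out, int key) {
        byte[] buf = new byte[8192]; // the only buffer, regardless of input size
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    buf[i] ^= (byte) key;
                }
                out.write(buf, 0, n);
                total += n;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

In real use the streams would typically be file streams opened with
try-with-resources, so that only one buffer's worth of each file is in memory
at a time.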

-Dave

On Sat, Nov 21, 2015 at 10:22 AM, mailinglist@j-b-s.de
<mailinglist@j-b-s.de> wrote:

> I guess you need to block the polling while you process files in parallel.
> A seda queue with a capacity limit will at least block the consumer. As I
> do not know what exactly you are doing with the files, or whether each file
> always requires the same amount of memory, it's hard to tell what memory
> settings to use. Always providing more memory is not a solution from my
> point of view, because you hit the same limit, just later.
>
> Limiting messages and using streaming / splitting will keep memory usage
> low (at least in our env it works that way, and we reduced memory usage
> from 1G to 128M per VM). But whether this is something for you...I don't
> know.
>
>
> Jens
>
> Sent from my iPhone
>
> > On 21.11.2015 at 16:40, David Hoffer <dhoffer6@gmail.com> wrote:
> >
> > Yes, when the sftp read thread stops it was still processing files it had
> > previously downloaded.  Since we can get so many files on each poll
> > (~1000) and we have to do a lot of decrypting of these files in subsequent
> > routes, it's possible that the processing of the 1000 files is not done
> > before the next poll, where we get another 1000 files.  Eventually the
> > SFTP endpoint will have fewer/no files and the rest of the routes can
> > catch up.  All the rest of the routes are file based (except the very
> > last) so there is no harm if intermediate folders get backed up with
> > files.
> >
> > We only have one SFTP connection for reading in this case.
> >
> > Do you think the seda approach is right for this case?  I can look into
> > it.  Note from my previous post that in my dev environment the reason it
> > stopped was an out of memory error...I doubt that is the same case in
> > production as the rest of the routes do not stop.
> >
> > -Dave
> >
> > On Sat, Nov 21, 2015 at 1:36 AM, mailinglist@j-b-s.de
> > <mailinglist@j-b-s.de> wrote:
> >
> >> Hi!
> >>
> >> When your sftp read thread stops, are the files still in process? In our
> >> env we had something similar in conjunction with splitting large files,
> >> because the initial message is pending until all processing is completed.
> >> We solved it using a seda queue (limited in size) in between our sftp
> >> consumer and processing route and "parallel" execution.
> >>
> >> one sftp consumer -> seda  (size limit) -> processing route (with dsl
> >> parallel)
> >>
> >> and this works without any problems.
> >>
> >> Maybe you have too many sftp connections? Maybe it's entirely independent
> >> of camel and you reached a file handle limit?
> >>
> >> Jens
> >>
> >>
> >> Sent from my iPhone
> >>
> >>> On 20.11.2015 at 23:09, David Hoffer <dhoffer6@gmail.com> wrote:
> >>>
> >>> This part I'm not clear on and it raises more questions.
> >>>
> >>> When using the JDK one generally uses the Executors factory methods to
> >>> create either a Fixed, Single or Cached thread pool.  These will use a
> >>> SynchronousQueue for Cached pools and LinkedBlockingQueue for Fixed or
> >>> Single pools.  In the case of SynchronousQueue there is no size...it
> >>> simply hands the new request off to either a thread in the pool or it
> >>> creates a new one.  And in the case of LinkedBlockingQueue it uses an
> >>> unbounded queue size.  Now it is possible to create a hybrid, e.g.
> >>> LinkedBlockingQueue with a max size, but it's not part of the factory
> >>> methods or common.  Another option is the ArrayBlockingQueue, which does
> >>> use a max size, but none of the factory methods use this type.
> >>>
> >>> So what type of thread pool does Camel create for the default thread
> >>> pool?  Since it's not fixed size I assumed it would use SynchronousQueue
> >>> and not have a separate worker queue.  However if Camel is creating a
> >>> hybrid using a LinkedBlockingQueue or ArrayBlockingQueue, is there a way
> >>> I can change that to be a SynchronousQueue so no queue?  Or is there a
> >>> compelling reason to use LinkedBlockingQueue in a cached pool?
> >>>
> >>> Now this gets to the problem I am trying to solve.  We have a Camel app
> >>> that deals with files, lots of them...e.g. all the routes deal with
> >>> files.  It starts with an sftp URL that gets files off a remote server
> >>> and then does a lot of subsequent file processing.  The problem is that
> >>> if the SFTP server has 55 files (example) and I start the Camel app it
> >>> processes them fine until about 14 or 15 files are left and then it just
> >>> stops.  The thread that does the polling of the server stops (at least
> >>> it appears to have stopped) and the processing of the 55 files stops,
> >>> e.g. it does not continue to process all of the original 55 files, it
> >>> stops with 14-15 left to process (and it never picks them up again on
> >>> the next poll).  And I have a breakpoint on my custom
> >>> SftpChangedExclusiveReadLockStrategy and it never is called again.
> >>>
> >>> Now getting back to the default thread pool and changing it, I would
> >>> like to change it so it uses more threads and no worker queue (like a
> >>> standard Executors cached thread pool) but I'm not certain that would
> >>> even help, as in the debugger & thread dumps I see that it looks like
> >>> the SFTP endpoint uses a Scheduled Thread Pool instead, which makes
> >>> sense since it's a polling (every 60 seconds in my case) operation.  So
> >>> is there another default pool that I can configure for Camel's scheduled
> >>> threads?
> >>>
> >>> All that being said why would the SFTP endpoint just quit?  I don't see
> >>> any blocked threads and no deadlock.  I'm new to Camel and just don't
> >>> know where to look for possible causes of this.
> >>>
> >>> Thanks,
> >>> -Dave
> >>>
> >>>
> >>>> On Thu, Nov 19, 2015 at 11:40 PM, Claus Ibsen <claus.ibsen@gmail.com>
> >>>> wrote:
> >>>>
> >>>> Yes, it's part of the JDK, as it specifies the size of the worker queue
> >>>> of the thread pool (ThreadPoolExecutor).
> >>>>
> >>>> For more docs see
> >>>> http://camel.apache.org/threading-model.html
> >>>>
> >>>> Or the Camel in Action books
> >>>>
> >>>>
> >>>>> On Fri, Nov 20, 2015 at 12:22 AM, David Hoffer <dhoffer6@gmail.com>
> >>>>> wrote:
> >>>>> I'm trying to understand the default Camel Thread Pool and how the
> >>>>> maxQueueSize is used, or more precisely what's it for?
> >>>>>
> >>>>> I can't find any documentation on what this really is or how it's
> >>>>> used.  I understand all the other parameters as they match what I'd
> >>>>> expect from the JDK...poolSize is the minimum threads to keep in the
> >>>>> pool for new tasks and maxPoolSize is the maximum number of the same.
> >>>>>
> >>>>> So how does maxQueueSize fit into this?  This isn't part of the JDK
> >>>>> thread pool so I don't know how Camel uses this.
> >>>>>
> >>>>> The context of my question is that we have a from sftp route that
> >>>>> seems to be getting thread starved.  E.g. the thread that polls the
> >>>>> sftp connection is slowing/stopping at times when it is busy
> >>>>> processing other files that were previously downloaded.
> >>>>>
> >>>>> We are using the default camel thread pool that I see has only a max
> >>>>> of 20 threads yet a maxQueueSize of 1000.  That doesn't make any sense
> >>>>> to me yet.  I would think one would want a much larger pool of threads
> >>>>> (as we are processing lots of files) but no queue at all...but not
> >>>>> sure on that as I don't understand how the queue is used.
> >>>>>
> >>>>> -Dave
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Claus Ibsen
> >>>> -----------------
> >>>> http://davsclaus.com @davsclaus
> >>>> Camel in Action 2: https://www.manning.com/ibsen2
> >>
>
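
On the maxQueueSize question quoted above: Claus's reply ties it to the worker
queue of the JDK ThreadPoolExecutor. The sketch below uses only the JDK to
show the order in which that executor makes its decisions: it starts threads
up to the core size, then queues tasks, and only grows toward the maximum once
the queue is full. The class QueueSizeDemo, the method observe, and the small
numbers (2 core threads, 4 max, queue capacity 3) are assumptions chosen for
illustration; they are not Camel's defaults and this is not Camel's own code.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueSizeDemo {

    // Returns {pool size after core tasks, pool size after the queue is full,
    //          queued task count, pool size after overflow, 1 if the next task
    //          was rejected else 0}.
    static int[] observe(int core, int max, int queueCapacity) {
        CountDownLatch release = new CountDownLatch(1);
        // Every task blocks until released, so all threads stay busy.
        Runnable blocked = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity), // bounded worker queue
                new ThreadPoolExecutor.AbortPolicy());    // reject when saturated
        int[] seen = new int[5];
        try {
            for (int i = 0; i < core; i++) pool.execute(blocked);
            seen[0] = pool.getPoolSize();       // core threads started
            for (int i = 0; i < queueCapacity; i++) pool.execute(blocked);
            seen[1] = pool.getPoolSize();       // still core: tasks were queued
            seen[2] = pool.getQueue().size();
            for (int i = 0; i < max - core; i++) pool.execute(blocked);
            seen[3] = pool.getPoolSize();       // queue full, so pool grew
            try {
                pool.execute(blocked);
            } catch (RejectedExecutionException e) {
                seen[4] = 1;                    // max threads and queue both full
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe(2, 4, 3)));
    }
}
```

With observe(2, 4, 3) the pool stays at 2 threads while the queue fills, grows
to 4 only after that, and then rejects the next task. By the same JDK rule, a
pool with a queue capacity of 1000 would not add threads beyond its core size
until 1000 tasks are waiting.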
