nifi-dev mailing list archives

From Andrew Grande <apere...@gmail.com>
Subject Re: Processor running slow in production, not locally
Date Thu, 06 Oct 2016 00:21:38 GMT
Just a sanity check: has the number of open file handles been increased as
per the quickstart document? You might need many more for your flow.

Another tip: when your server experiences hiccups like that, try running
'nifi.sh dump save-in-this-file.txt' and investigate (or share) where the
NiFi threads are being held back.
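For reference, the same tally one does by eye on such a dump file can be made programmatically with the JDK's standard management beans; this sketch is illustrative only and is not NiFi code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSummary {
    /**
     * Count live JVM threads by state; many BLOCKED/WAITING threads
     * relative to RUNNABLE ones suggest contention.
     */
    public static Map<Thread.State, Integer> summarize() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        summarize().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

This only works from inside (or attached to) the JVM in question; 'nifi.sh dump' remains the simplest way to capture the full stack traces.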

Andrew

On Tue, Oct 4, 2016, 10:54 AM Russell Bateman <
russell.bateman@perfectsearchcorp.com> wrote:

> We use the templating to create FHIR XML, in this case, a
>
>     <Binary>
>        ...
>        <content value="$flowfile_contents" />
>     </Binary>
>
> construct that includes a base-64 encoding of a PDF, the flowfile
> contents coming into the templating processor. These can run to
> megabytes in size, though our sample data was just under 1 MB.
>
> Yesterday, I built a new, reduced flow restricting the use of my
> /VelocityTemplating/ processor to performing only the part of the task
> that I suspected was taking so much time, namely, copying the base-64
> data into the template in place of the VTL macro. However, I could not
> reproduce the problem, even though I did this on the very production
> server (actually, more of a staging server, but it was the server
> where the trouble was detected in the first place).
>
> Predictably (that is, if, like me, you believe Murphy reigns supreme in
> this universe), the action using the very files in question took
> virtually no time at all, just as in my experience running on my local
> development host. I then slightly expanded the new flow to take in
> some of the other trappings of the original one (but it was the
> templating that was reported as the bottleneck: minutes to fill out
> the template instead of milliseconds). In short, I could not replicate
> the problem. True, the moon is in a different phase than late last
> week when this was reported.
>
> For the benefit of the community, I will come back here and report if
> this reoccurs and/or when we reach a decision about anything. At
> present, we're looking to force re-ingestion of the run, using the
> original flow design, including the documents that reportedly
> experienced this trouble, to see whether it happens again.
>
> In the meantime, I can say:
>
>     - I keep no state in this processor (indeed, I try not to, and I
>     don't think I have anything stateful in any of our custom
>     processors).
>     - The server has some 40 cores, 128 GB of RAM, and 12 TB of disk:
>     dedicated hardware, CentOS 7, recently built and installed.
>     - Reportedly, I learned, little else was going on on the server at
>     the time, either in NiFi or elsewhere.
>     - The NiFi heap is configured to be 12 GB.
>     - We are not so far along yet as to understand thread usage or
>     garbage-collection state.
>
> Again, thanks for the suggestions from both of you.
>
> Russ
>
>
> On 10/03/2016 06:28 PM, Joe Witt wrote:
> > Russ,
> >
> > As Jeff points out, a lack of available threads could be a factor in
> > slower processing times, but this would manifest itself in your
> > seeing that the processor isn't running very often.  If instead the
> > processor itself takes much longer to execute than on the other box,
> > then it is probably best to look at some other culprits.  To check
> > this out, you can view the status history and look at the average
> > number of tasks and the average task time for this processor.  Does
> > it look right to you in terms of how often it runs and how long it
> > takes, and is the amount of time it takes growing?
> >
> > If you find that the performance of this processor itself is
> > slowing, then consider a few things.
> > 1) Does it maintain some internal state, and if so, is the data
> > structure it uses efficient for lookups?
> > 2) How does your heap look?  Is there a lot of garbage-collection
> > activity?  Are there any full garbage collections, and if so, how
> > often?  It should generally be the case in a well-configured and
> > well-designed system that full garbage collections never occur (ever).
> > 3) Attaching a remote debugger and/or running profilers on it can be
> > really illuminating.
> >
> > Joe
> >
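Joe's second point above, checking garbage-collection activity, can be done from inside the JVM with the standard management beans; the class and method names here are illustrative, not NiFi code:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    /**
     * Print per-collector counts and accumulated time; a steadily growing
     * time on the old-generation collector points at heap pressure.
     */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
            total += Math.max(0, gc.getCollectionTime()); // -1 means unsupported
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Total GC time (ms): " + totalGcTimeMillis());
    }
}
```

Sampling this periodically and watching the deltas gives a crude but dependency-free view of whether full collections are occurring.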
> > On Mon, Oct 3, 2016 at 11:26 AM, Jeff <jtswork@gmail.com> wrote:
> >> Russell,
> >>
> >> This sounds like it's an environmental issue.  Are you able to see
> >> the heap usage on the production machine?  Are there enough
> >> available threads to get the throughput you are observing when you
> >> run locally?  Have you double-checked the scheduling tab on the
> >> processor config to make sure it is running as aggressively as it
> >> runs locally?
> >>
> >> I have run into this sort of thing before, and it was because of
> >> flowfile congestion in other areas of the flow, and there were no
> >> threads available for other processors to get through their own
> >> queues.
> >>
> >> Just trying to think through some of the obvious/high-level things
> >> that might be affecting your flow...
> >>
> >> - Jeff
> >>
> >> On Mon, Oct 3, 2016 at 9:43 AM Russell Bateman <
> >> russell.bateman@perfectsearchcorp.com> wrote:
> >>
> >>> We use NiFi for an ETL feed. On one of the lines, we use a custom
> >>> processor, *VelocityTemplating* (calls Apache Velocity), which
> >>> works very well and indeed is imperceptibly fast when run locally
> >>> on the same data (template, VTL macros, substitution fodder).
> >>> However, in production it's another matter. What takes no time at
> >>> all in local runs takes minutes in that environment.
> >>>
> >>> I'm looking for suggestions as to a) why this might be and b) how
> >>> best to go about examining/debugging it. I think I will soon have
> >>> remote access to the production machine (a VPN must be set up).
> >>>
> >>> Thanks,
> >>>
> >>> Russ
> >>>
>
>
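The templating step described in this thread amounts to substituting a base-64 payload into a small XML template. The stand-in below uses plain string replacement rather than actual Velocity or the real VelocityTemplating processor, just to show the shape of the work on a roughly 1 MB payload:

```java
import java.util.Base64;

public class BinaryTemplateSketch {
    // The FHIR <Binary> template from the thread; $flowfile_contents stands
    // in for the base-64 encoded PDF carried by the flowfile.
    static final String TEMPLATE =
            "<Binary>\n   <content value=\"$flowfile_contents\" />\n</Binary>";

    // Encode the payload and substitute it into the template.
    public static String fill(byte[] pdfBytes) {
        String encoded = Base64.getEncoder().encodeToString(pdfBytes);
        return TEMPLATE.replace("$flowfile_contents", encoded);
    }

    public static void main(String[] args) {
        byte[] fakePdf = new byte[1_000_000]; // roughly the sample-data size
        System.out.println("Rendered length: " + fill(fakePdf).length());
    }
}
```

A single substitution like this should take milliseconds even at megabyte sizes, which is consistent with Russ being unable to reproduce the slowdown when running the templating step in isolation.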
