nifi-users mailing list archives

From Mark Payne <marka...@hotmail.com>
Subject Re: Wait from multiple inputs before ending the flow
Date Sun, 13 Dec 2015 14:19:12 GMT
Louis-Etienne,

I see that you mentioned that you are happy about the Syslog feature in 0.4.0 - are you actually
using 0.4.0 for this dataflow? In 0.4.0, the InvokeHTTP should be penalizing any FlowFile that is
routed to the 'failure' relationship. Penalization is the term that we use in NiFi to essentially
"sleep" a FlowFile, as you were describing. The amount of time to have it sleep is determined by
the Settings tab on the Processor that penalizes the FlowFile. So for your flow, InvokeHTTP is
penalizing it, so in the Settings tab of InvokeHTTP there is a Penalty Duration that you can
configure. The default is 30 seconds. However, it's quite possible that versions prior to 0.4.0
do not penalize when routing to failure for InvokeHTTP.
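
To make the idea concrete, here is a rough sketch of how penalization combines with your retry
counter. It is a toy model in plain Python, not NiFi code: the names FlowFile, Queue,
route_failure, penalty_seconds and max_attempts are made up for illustration, and the only value
taken from NiFi is the 30 second default. The point is that a penalized FlowFile stays in the
queue but is skipped until its penalty expires, so other FlowFiles are not blocked behind it.

```python
from dataclasses import dataclass


@dataclass
class FlowFile:
    name: str
    attempts: int = 0              # retry counter (the UpdateAttribute step)
    penalized_until: float = 0.0   # time before which the FlowFile is skipped


class Queue:
    """Toy connection queue: penalized FlowFiles stay queued but are not handed out."""

    def __init__(self):
        self.items = []

    def put(self, flow_file):
        self.items.append(flow_file)

    def poll(self, now):
        # Hand out the first FlowFile whose penalty has expired, if any.
        for index, flow_file in enumerate(self.items):
            if flow_file.penalized_until <= now:
                return self.items.pop(index)
        return None


def route_failure(flow_file, queue, now, penalty_seconds=30.0, max_attempts=3):
    """Count the failure; re-queue with a penalty, or give up after max_attempts.

    Returns True if the FlowFile was queued for another try, False otherwise.
    """
    flow_file.attempts += 1
    if flow_file.attempts >= max_attempts:
        return False
    flow_file.penalized_until = now + penalty_seconds
    queue.put(flow_file)
    return True
```

With this model, a FlowFile that failed at time 0 is not offered again until time 30, while a
second FlowFile queued behind it is handed out immediately.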

Thanks
-Mark


> On Dec 12, 2015, at 2:19 PM, Louis-Étienne Dorval <ledor473@gmail.com> wrote:
> 
> Juan,
> 
> Very good advice! 
> I've updated my loop to add a processor that runs every X seconds when a retry is
> needed, so that it won't block the other requests:
> 
> InvokeHttp --(failure)--> UpdateAttribute (increment a counter) --> RouteOnAttribute
> (if counter lower than X will retry) --(if retriable)--> UpdateAttribute (does nothing
> but scheduled to run every X sec) --> InvokeHttp (same as the first)
> 
> Thank you
> Louis-Etienne
> 
> On 12 December 2015 at 14:07, Juan Sequeiros <hellojuan@gmail.com> wrote:
> Good afternoon,
> 
> All processors have a Scheduling tab where you can set them to run every X amount of
> time. I would set it on your last InvokeHttp.
> 
> If time is important, look at the prioritizer settings too, so that you can, for
> example, send a more important FlowFile through first.
> 
> On Dec 12, 2015 13:48, "Louis-Étienne Dorval" <ledor473@gmail.com> wrote:
> Hi again,
> 
> 
> The MergeContent works perfectly for my case! The flow I described in the previous
> email changed a bit, but it's still working as expected.
> 
> The only problem is that NiFi is now much faster than the existing system running in parallel
> (which is really good). I've built the "retry loop" described below, but it's still too fast:
> InvokeHttp --(failure)--> UpdateAttribute (increment a counter) --> RouteOnAttribute
> (if lower than X will retry) --> InvokeHttp
> 
> Question: Is there something that already exists which could "sleep" a FlowFile for X
> seconds before continuing?
> 
> 
> Best regards and great job on the version 0.4.0, the Syslog feature is much appreciated!
> Louis-Etienne
> 
> PS: Let me know if I should have started a new email thread with that question. 
> 
> 
> On 6 December 2015 at 23:30, Joe Witt <joe.witt@gmail.com> wrote:
> Louis-Etienne,
> 
> NIFI-190 isn't scheduled on anything as of yet.  We had some design
> questions/ideas and your example informs it even further.
> 
> I think the custom proc method you mention will work out well.
> Ultimately there will need to be one anyway to deal with the logic of
> merging this particular format+schema.
> 
> Thanks
> Joe
> 
> On Sun, Dec 6, 2015 at 11:28 PM, <ledor473@gmail.com> wrote:
> > Joe,
> >
> > Thanks for the prompt reply.
> > About the merge, both messages will be JSON and I need some specific parts from both.
> > I'll recheck the doc to see what my options are, but I think that using FlowFile
> > Streams and a custom processor that does the logic might be good.
> >
> > About the HoldProcessor, you must be talking about NIFI-190. The way you describe it, it
> > seems to be what I'm looking for.
> > But from the JIRA and a quick look at the PR, it seems like I would lose the message
> > from Topic2.
> >
> > I'll dig into the code of the PR and the MergeContent processor in order to have a
> > better understanding.
> >
> > Was that JIRA scheduled for a specific milestone? It would probably be a great addition,
> > but maybe it requires a lot of changes that I don't see yet.
> > Regards,
> > Louis-Etienne
> >
> >> On 6, 2015, at 9:42 PM, Joe Witt <joe.witt@gmail.com> wrote:
> >>
> >> Louis-Etienne,
> >>
> >> My initial thought is your idea with MergeContent is the right one.
> >> However, the issue there is not just the combining of the data but the
> >> 'what does merging truly mean in that case'.  So it is a bit undefined
> >> what the next step will be.  Merge the content?  If so, how?  What is
> >> the format and schema of the objects before the merge and after?
> >>
> >> Another member of the community had an idea for a concept of a
> >> HoldProcessor.  It would allow these sorts of multi-object gates to
> >> occur.  The same issue exists of what to do once the gate criteria is
> >> hit but at that point you'd have more control over it.  MergeContent
> >> is an already prescribed set of behaviors whereas HoldContent would
> >> let you choose the next gate.  We really should get on with helping
> >> get that contribution in.
> >>
> >> Thanks
> >> Joe
> >>
> >>> On Sun, Dec 6, 2015 at 9:35 PM, Louis-Étienne Dorval <ledor473@gmail.com> wrote:
> >>> Hi everyone!
> >>>
> >>> I'm very excited to start using NiFi and I think that it will be very
> >>> useful for some projects.
> >>>
> >>> I've been playing with it for some time and built a few basic flows, but I'm
> >>> having a hard time figuring out how to achieve a part of my flow, or whether
> >>> NiFi will be able to do it.
> >>> I'm building a flow around existing systems, so NiFi would run in parallel
> >>> with them and gather the output of these systems (everything is asynchronous)
> >>> to take actions.
> >>>
> >>> Everything starts with a GetJMSTopic on Topic1, followed by 2-3 processors
> >>> that do attribute extraction.
> >>> During that time the existing system will process the same message, enrich
> >>> it (but also remove some useful information) and publish it on Topic2.
> >>> I need the message from Topic2, so I've added another GetJMSTopic on Topic2.
> >>> Then I need to somehow take the FlowFile from Topic1 and from Topic2,
> >>> "merge" them together in order to have the attributes from both FlowFiles.
> >>> After that I will probably need to use the GetMongo to access some
> >>> information. This will probably create a new FlowFile that I need to "merge"
> >>> with the others.
> >>> Then I'll put that in HBase or something else, not sure yet.
> >>>
> >>> The part that I'm not sure how to solve is the "merge" of multiple
> >>> FlowFiles. I'm hesitating between the MergeContent processor and
> >>> DetectDuplicate:
> >>>
> >>> MergeContent seems like what I need, but the existing systems might add some
> >>> latency (and it will increase when there are a lot of publishes on Topic1), so
> >>> I would need to increase the 'Maximum number of Bins'.
> >>> It will probably affect the performance of the system, but how badly?
> >>> DetectDuplicate would feel awkward to use since it's not really a
> >>> duplicate, but it would be more lightweight (it only keeps a hash). But will
> >>> I be able to find the previous FlowFile with "original.flowfile.description"?
> >>>
> >>>
> >>> Let me know if there's another option that I didn't look into.
> >>> Or maybe my problem is really trivial but I need to change my perspective
> >>> on it...
> >>>
> >>> Best regards,
> >>> Louis-Etienne
> 
> 

