nifi-dev mailing list archives

From Eric Ulicny <euli...@umich.edu>
Subject Re: Nifi Parallel Execution
Date Thu, 12 Apr 2018 13:58:27 GMT
We have attempted to use the distributed map cache with the
DetectDuplicate processor as recommended, to no avail. The first time two
identical flowfiles are sent simultaneously, they both make it through to
the non-duplicate relationship. After that point they are detected
as duplicates as expected.

In particular, we are testing with GenerateFlowFile on a two-node cluster.
We extract the SHA key and use that when detecting duplicates.
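
To make the symptom concrete, here is a minimal sketch in Python. It is not
NiFi code: MapCache, put_if_absent, route and run_cluster are hypothetical
names, and the "separate caches" case is only an assumption about what could
produce this behavior, not something confirmed in this thread. It compares
two nodes sharing one cache with two nodes that each end up with their own.

```python
import threading


class MapCache:
    """Hypothetical stand-in for a map cache server (not the NiFi API)."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def put_if_absent(self, key):
        """Atomically record key; return True only if it was not seen before."""
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def route(cache, key):
    """Pick the relationship a flowfile with this key would be routed to."""
    return "non-duplicate" if cache.put_if_absent(key) else "duplicate"


def run_cluster(caches, key, rounds=2):
    """Node i checks `key` against caches[i], once per round."""
    return [[route(cache, key) for cache in caches] for _ in range(rounds)]


# Both nodes talk to the same cache: only one copy is ever non-duplicate.
shared = MapCache()
shared_results = run_cluster([shared, shared], "abc123")

# Assumed misconfiguration: each node has its own cache, so each node
# lets the first copy through and only detects duplicates afterwards.
separate_results = run_cluster([MapCache(), MapCache()], "abc123")

print(shared_results)
print(separate_results)
```

In the second case both copies pass as non-duplicate in the first round and
are caught from then on, which matches what we observe.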

-Eric




On Tue, Apr 10, 2018, 1:32 PM Bryan Bende <bbende@gmail.com> wrote:

> Hello,
>
> DetectDuplicate uses a DistributedMapCacheClientService which would be
> connecting to a DistributedMapCacheServer on one of your nodes.
>
> So all nodes should be connecting to the same cache server which is
> where the information about previously seen data is stored.
>
> -Bryan
>
> On Tue, Apr 10, 2018 at 1:24 PM, Eric Ulicny <eulicny@umich.edu> wrote:
> > Hello,
> >
> > We have a use case where we execute processors on all nodes but would like
> > to use the detect duplicate processor to ensure records are unique. We are
> > observing that we must run it on one node to truly detect duplicates. Is
> > there any way to merge flowfiles from all running executors?
> >
> > -Eric
>
