nifi-dev mailing list archives

From: Bryan Bende <bbe...@gmail.com>
Subject: Re: Getting Duplicate Flowfiles from InvokeHttp and QueryElasticsearchHttp
Date: Mon, 18 Mar 2019 17:01:16 GMT

Hello,

Are you running a NiFi cluster of 2 nodes, or a standalone instance of NiFi?

-Bryan

On Mon, Mar 18, 2019 at 12:21 PM Martin Cooley <martin.cooley@gmail.com> wrote:
>
> If I configure an InvokeHttp processor to query against an Elasticsearch node, I should
> get one JSON object written to a flowfile.  If I use the QueryElasticsearchHttp processor
> and the query returns two documents from the index, I should get two JSON objects, each
> written to its own flowfile.
>
> However, the InvokeHttp processor is writing two flowfiles.  They have separate UUIDs,
> but the contents are the same.  Yes, the processor is scheduled to run every 900 seconds.
>
> The QueryElasticsearchHttp processor is writing 4 flowfiles.  It, too, is scheduled to
> run every 900 seconds.
>
> Elasticsearch is returning:
>
> {
>   "took": 1,
>   "timed_out": false,
>   "_shards": {
>     "total": 5,
>     "successful": 5,
>     "skipped": 0,
>     "failed": 0
>   },
>   "hits": {
>     "total": 2,
>     "max_score": 0.2876821,
>     "hits": [
>       {
>         "_index": "etltodoc",
>         "_type": "document_record",
>         "_id": "2045680246129",
>         "_score": 0.2876821,
>         "_source": {
>           "myguid": "2045680246129",
>           "filename": "sample1.pdf",
>           "exception": "",
>           "original_filename": "\\\\f1\\DocsRepo\\CF\\sample1.pdf",
>           "conceptCode": "C2159782",
>           "timestamp": "2019-03-12T12:43:21.166531",
>           "status": "delivered"
>         }
>       },
>       {
>         "_index": "etltodock",
>         "_type": "document_record",
>         "_id": "2045680246128",
>         "_score": 0.2876821,
>         "_source": {
>           "myguid": "2045680246128",
>           "filename": "sample2.pdf",
>           "exception": "",
>           "original_filename": "\\\\f1\\DocsRepo\\CF\\sample2.pdf",
>           "conceptCode": "C2159782",
>           "timestamp": "2019-03-12T12:43:21.165467",
>           "status": "delivered"
>         }
>       }
>     ]
>   }
> }
>
> I'm hoping I just have something misconfigured, but I have tried changing just about
> every setting.  On the QueryElasticsearchHttp processor, if I set the limit to one, I
> still get duplicates: two flowfiles instead of four.
>
> Any help will be much appreciated.
>
> Martin
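
A rough way to see why the cluster question matters: the sketch below assumes that in a cluster each node runs the processor on its own schedule unless its execution is restricted to the primary node. Under that assumption, the reported counts (2, 4, and 2 with the limit set to one) are what a 2-node cluster would produce. The function `expected_flowfiles` and its parameters are hypothetical names used only for illustration, not part of any NiFi API, and the response is a trimmed copy of the one quoted above.

```python
import json

# Trimmed copy of the quoted Elasticsearch response; only the fields
# this sketch reads are kept.
RESPONSE = json.loads("""
{"hits": {"total": 2, "hits": [
  {"_id": "2045680246129", "_source": {"filename": "sample1.pdf"}},
  {"_id": "2045680246128", "_source": {"filename": "sample2.pdf"}}
]}}
""")


def expected_flowfiles(hits_per_run, node_count, one_per_hit):
    """Hypothetical model: every node runs the processor once per schedule.

    one_per_hit=False models InvokeHttp (one flowfile per response);
    one_per_hit=True models QueryElasticsearchHttp (one flowfile per hit).
    """
    per_node = hits_per_run if one_per_hit else 1
    return per_node * node_count


hits = len(RESPONSE["hits"]["hits"])

for nodes in (1, 2):
    invoke_http = expected_flowfiles(hits, nodes, one_per_hit=False)
    query_es = expected_flowfiles(hits, nodes, one_per_hit=True)
    query_es_limit_1 = expected_flowfiles(min(hits, 1), nodes, one_per_hit=True)
    print(f"nodes={nodes}: InvokeHttp={invoke_http}, "
          f"QueryElasticsearchHttp={query_es}, with limit 1={query_es_limit_1}")
```

If the instance turns out to be standalone, this model does not explain the duplicates and the cause would have to be something else.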
