From: Koji Kawamura
Date: Fri, 13 Apr 2018 16:11:04 +0900
Subject: Re: MergeRecord
To: users@nifi.apache.org

Hi,

I've tested the InferAvroSchema and
MergeRecord scenario. As you described, records are not merged as expected.

The reason, in my case, is that InferAvroSchema generates schema text like this:

inferred.avro.schema
{
  "type" : "record",
  "name" : "example",
  "doc" : "Schema generated by Kite",
  "fields" : [ {
    "name" : "Key",
    "type" : "long",
    "doc" : "Type inferred from '4'"
  }, {
    "name" : "Value",
    "type" : "string",
    "doc" : "Type inferred from 'four'"
  } ]
}

MergeRecord uses that schema text as the groupId, even if 'Correlation Attribute' is specified.
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/MergeRecord.java#L348

So even if the schema structure is the same, the merging group ID will differ whenever the actual values vary, because the inferred "doc" text changes with them.

If you can use SchemaRegistry, it should work as expected.

Thanks,
Koji

On Fri, Apr 13, 2018 at 2:45 PM, DEHAY Aurelien wrote:
>
> Hello.
>
> Thanks for the answer.
>
> The 20k is just the last test. I've tested with 100 and 1000, with an input queue of 10k, and it doesn't change anything.
>
> I will try to simplify the test case and not use the inferred schema.
>
> Regards
>
>> On 13 Apr 2018, at 04:50, Koji Kawamura wrote:
>>
>> Hello,
>>
>> I checked your template. I haven't run the flow since I don't have
>> sample input XML files.
>> However, when I looked at the MergeRecord processor configuration, I found that:
>> Minimum Number of Records = 20000
>> Max Bin Age = 10 sec
>>
>> From a brief look at the MergeRecord source code, it expires a bin that is
>> not complete after Max Bin Age.
>> Do you always have 20,000 records to merge within a 10 sec window?
>> If not, I recommend lowering the minimum number of records.
>>
>> I haven't checked the actual MergeRecord behavior so I may be wrong, but
>> it is worth changing the configuration.
>>
>> Hope this helps,
>> Koji
>>
>>
>> On Fri, Apr 13, 2018 at 12:26 AM, DEHAY Aurelien
>> wrote:
>>> Hello.
>>>
>>> Please see the template attached. The problem we have is that, whatever configuration we set in MergeRecord, we can't get it to actually merge records.
>>>
>>> All the records have the same format; we infer the schema so that we don't have to write it down ourselves. The only difference between the schemas is then that the doc="" fields are different. Is it possible for that to prevent the merging?
>>>
>>> Thanks for any pointer or info.
>>>
>>>
>>> Aurélien DEHAY
>>>
>>>
>>>
>>> This electronic transmission (and any attachments thereto) is intended solely for the use of the addressee(s). It may contain confidential or legally privileged information. If you are not the intended recipient of this message, you must delete it immediately and notify the sender. Any unauthorized use or disclosure of this message is strictly prohibited. Faurecia does not guarantee the integrity of this transmission and shall therefore never be liable if the message is altered or falsified nor for any virus, interception or damage to your system.
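
The grouping problem described in the reply above can be illustrated with a small toy model. This is only a sketch in Python, not NiFi code: the helper names (infer_schema_text, strip_docs, group_by) are hypothetical, and the schema shape is copied from the inferred.avro.schema example in this thread. It assumes, as the reply states, that the raw schema text acts as the group key.

```python
import json
from collections import defaultdict


def infer_schema_text(key, value):
    """Hypothetical stand-in that mimics the inferred schema quoted in the thread."""
    return json.dumps({
        "type": "record",
        "name": "example",
        "doc": "Schema generated by Kite",
        "fields": [
            {"name": "Key", "type": "long", "doc": f"Type inferred from '{key}'"},
            {"name": "Value", "type": "string", "doc": f"Type inferred from '{value}'"},
        ],
    }, sort_keys=True)


def strip_docs(node):
    """Recursively drop every "doc" entry so only the structure remains."""
    if isinstance(node, dict):
        return {k: strip_docs(v) for k, v in node.items() if k != "doc"}
    if isinstance(node, list):
        return [strip_docs(v) for v in node]
    return node


def group_by(schema_texts, key_fn):
    """Group item indexes by the key computed from each schema text."""
    groups = defaultdict(list)
    for index, text in enumerate(schema_texts):
        groups[key_fn(text)].append(index)
    return list(groups.values())


# Three inputs with the same structure; the second was inferred from other sample values.
schemas = [
    infer_schema_text(4, "four"),
    infer_schema_text(5, "five"),
    infer_schema_text(4, "four"),
]

# Keyed on the raw schema text, the differing "doc" strings split the inputs.
raw_groups = group_by(schemas, lambda text: text)

# Keyed on the structure alone, all inputs fall into one group.
structural_groups = group_by(
    schemas,
    lambda text: json.dumps(strip_docs(json.loads(text)), sort_keys=True),
)

print(len(raw_groups))         # 2
print(len(structural_groups))  # 1
```

The second grouping is shown only to make the cause visible. In the flow itself, the suggestion in this thread is to supply one fixed schema (for example via SchemaRegistry) so that every FlowFile carries identical schema text.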