From users-return-8176-archive-asf-public=cust-asf.ponee.io@nifi.apache.org Tue Mar 13 04:13:25 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 1C8E118064D for ; Tue, 13 Mar 2018 04:13:24 +0100 (CET) Received: (qmail 77852 invoked by uid 500); 13 Mar 2018 03:13:23 -0000 Mailing-List: contact users-help@nifi.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@nifi.apache.org Delivered-To: mailing list users@nifi.apache.org Received: (qmail 77837 invoked by uid 99); 13 Mar 2018 03:13:23 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Mar 2018 03:13:23 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id A7F4F1806E6 for ; Tue, 13 Mar 2018 03:13:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.148 X-Spam-Level: X-Spam-Status: No, score=0.148 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id 8r7Pgcm0TrVh for ; Tue, 13 Mar 2018 03:13:21 +0000 (UTC) Received: from mail-pf0-f177.google.com (mail-pf0-f177.google.com [209.85.192.177]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 16E645F24F for ; Tue, 13 Mar 2018 03:13:21 +0000 (UTC) Received: by mail-pf0-f177.google.com with SMTP id f80so5290844pfa.8 for ; Mon, 12 Mar 2018 20:13:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding:content-language; bh=1LXSLHFLXfRNKmDjKWK0vcjGgpyN6Q+R/NM1Xf3dEHc=; b=Ab3amEwZe5ZwsKtF7/LZxECOZ3FSJNS+dZxtgigdFD30G6lhKHbNGFI7CjR4ImVt/F RRWJOEH76FjjQv7VfA626TiTIJALIjBU3/rjhM62lIQOlC5YMIX7WLnDuL1iFYFn2d2G niYbRhwaJM72Tl25KZfIyifQFjB3t/M19ysPnT2cL5wrPJuMrcV+plHqYtV4zUN+uk5p DAe9fEFkvKvi8S+4VuP0VrqHBP42FPAJoJEKNn3ucwD8A7fW3hoMDTW06f4B6AyCWy58 IVvchhF7oGqO1MzRubMdJzzwWwc5O9iCefZpLuNAS4BrVKOjnXpmlhp53b2pfxIIwJkr oZJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=1LXSLHFLXfRNKmDjKWK0vcjGgpyN6Q+R/NM1Xf3dEHc=; b=sCNx6l+uTgyJg56w5Zi6tdNHlHFk0Lprt4hVMDmLf4qhyoDMG5Rjfev/NHykNuoi+T TA/DJ6B7JFQwR4u5cTw/zLW5KlcUNniTYvZXmcY1QHQ8Qfv8q4HPs2dZRA8SmMFjBCFE OQj0pfsCY0Wmrpa72sCs7jtnE22BsCBlGEAHdQ7Z3RcjdLkgFMyVAoTdabr50bY4PBJb DnMRgWjmZEbP10q1XZzqvS9oZ1aYxjVs22IJ4TmizEY5NHfhN3z+wgC+1htrlALMwnpC 4HuaaAdRvfdEpdA9mw2Z/xRT2ILxrhrjxYcyrw+1gPfBNyDzaTIbeneJ2gMpX6UUSkL/ 3jhw== X-Gm-Message-State: AElRT7G3X7N2ec+iAOKKq2f4MlaVcghc5qfm4D+ybA+9s949PJA45pKW 2u7LIFiRMY39KVsG4m3xjn2t8wok X-Google-Smtp-Source: AG47ELvkjQ54zhNBZsTDnWpv38OG6JSq4naG94xw3+tCVulLzEZXueCXeea6J9Bzp5dQTERGz8ox/Q== X-Received: by 10.98.180.13 with SMTP id h13mr10015160pfn.139.1520910799621; Mon, 12 Mar 2018 20:13:19 -0700 (PDT) Received: from [192.168.1.150] (c-24-16-20-208.hsd1.wa.comcast.net. [24.16.20.208]) by smtp.gmail.com with ESMTPSA id q65sm20988871pfd.140.2018.03.12.20.13.18 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 12 Mar 2018 20:13:18 -0700 (PDT) Subject: Re: nifi unique json field list To: users@nifi.apache.org References: From: scott Message-ID: Date: Mon, 12 Mar 2018 20:13:17 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Thanks Matt and Boris for your comments/suggestions. I came to the same conclusion as Boris yesterday and put together a Groovy script. After I merge all the records into one flowfile, the Groovy script iterates through and comes up with a complete unique set of fields. I was hoping there was a solution using distributed cache or something similar, but I couldn't find a way to iterate through all keys/values in the cache without first knowing the names of the keys. Any thought's about that approach? Thanks again, Scott On 03/12/2018 07:26 AM, Matt Burgess wrote: > Scott, > > I'm assuming you're trying to create a single (possibly sparse) table > for all the incoming JSON records, and that you do not know the total > set of fields ahead of time. If you do know the fields ahead of time, > you can create the table ahead of time and use the JoltTransformJSON > processor to add default values for any missing fields in any records. > You may also be able to do this with ConvertRecord. > > If you do not know the total set of fields ahead of time, how will you > know you have "enough" records in the flow whose fields fully define > the table? If you won't ever know for sure that your flow contains > records whose values fully define the table, you'd have to try to add > missing columns as they are encountered (see [1] for SQL Server > commands to do this). That can get messy, you'd have to pull the > field names out into attributes and use PutSQL's SQL Statement > property (added in NiFi 1.5.0) with NiFi Expression Language to > execute the SQL command for adding a column if it doesn't exist. > Otherwise you'd have to use Wait/Notify or some other barrier stuff to > ensure the column was added before a record with that field is pushed > to the database. > > Basically it's very difficult to do if you don't know all the possible > table columns up front. If you are consuming from a Web service, I > would hope their API at least documents all possible fields, and you > could create a table from that. > > Regards, > Matt > > [1] https://stackoverflow.com/questions/8870802/add-a-column-to-a-table-if-it-does-not-already-exist > > > On Sat, Mar 10, 2018 at 7:18 PM, scott wrote: >> Hello NiFi community, >> >> I have a problem I've been working on that I was hoping to get some >> suggestions for a solution. I'm ingesting a dump of json records from a web >> service periodically. The json sometimes comes with some fields missing. My >> goal is to insert the data into a SQLServer DB after dynamically creating >> the table based on the fields in the json records. The problem is that when >> I use the list of fields from a record that is missing some fields, I get >> errors when I try to insert. How do I take all the fields from all records >> in my dump and then create a unique list of all fields for me to create the >> table with? >> >> >> Thanks, >> >> Scott >>