nifi-users mailing list archives

From "Carlos Manuel Fernandes (DSI)" <carlos.antonio.fernan...@cgd.pt>
Subject RE: ELT on Nifi
Date Tue, 04 Oct 2016 16:53:37 GMT
Hi Joe,

I can contribute the template whose image I sent earlier. As for building a processor, I'm not
skilled enough in Java for that task; I mostly program in Groovy. If someone takes on that task,
I can help with ideas and tests.

Thanks

Carlos



From: Joe Witt [mailto:joe.witt@gmail.com]
Sent: Tuesday, October 4, 2016 01:23
To: users@nifi.apache.org
Subject: Re: ELT on Nifi

Carlos,

I think you're right that more can be done to support a broad range of transforms and styles
of transforms.  The approach you're suggesting makes sense for the style you prefer and I
could envision such a processor that can execute the transform/statements you're showing in
that JSON sample.  Are you proposing to contribute such a processor?

Thanks
Joe

On Mon, Oct 3, 2016 at 2:25 PM, Carlos Manuel Fernandes (DSI) <carlos.antonio.fernandes@cgd.pt> wrote:
Hi all,

When I first saw NiFi, I tried to build a classical ETL/ELT flow, and this question comes up
repeatedly among new users.

NiFi has very good processors for Extract and Load. The problem arises with Transform, because
ETL/ELT tools have specific "processors" (e.g. map, SCD) bound to DW concepts and sometimes
bound to a specific database (e.g. SCDNetezza). The transform processors in NiFi are general
purpose and not tied to these concepts. The immediate solution is to create a lot of custom
script processors, but then the ELT metadata (SQL) ends up as attributes or code of the
processors, which is not an ideal solution.

But if we put the Transform logic outside NiFi, for example in a JSON structure, then it is
relatively easy to construct an ELT NiFi template capable of running generic ELT flows.

Example of an ELT JSON structure (the "steps" inside the "flow" are to be executed by PutSQL
in the same transaction):
{
       "Transformer": [{
             "name": "foo1",
             "type": "Map",
             "description": "Summarize the table foo from table bar",
             "flow": [{
                    "step": 1,
                    "description": "delete all data",
                    "stmt": "delete from  foo"
             }, {
                    "step": 2,
                    "description": "Sum c2 by c1",
                    "stmt": "insert into foo(c1, c2) select c1,sum(c2) from bar group by c1"
             }]
       }, {
             "name": "foo2",
             "type": "SCD - Slowly Changing Dimension type 1",
             "description": "Update a prod table based on stage table",
             "flow": [{
                    "step": 1,
                    "description": "Process type 1",
                    "stmt": "Update Prod Set Prod.columns = Stage.Columns From Stage Inner Join Prod on Stage.key = Prod.key Where Stage.IsType1 = 1"
             }]
       }]
}
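
To make the intended semantics concrete, here is a minimal Python sketch of how such a structure could be interpreted. It is not the NiFi template and does not use PutSQL: it assumes an in-memory SQLite database as a stand-in, and the names `run_transformer` and `run_all` are hypothetical helpers introduced only for this illustration. It shows the one rule stated above: the steps of a flow run in "step" order inside a single transaction, so a failing step rolls back the earlier ones.

```python
import json
import sqlite3

# Assumption: a cut-down copy of the structure above with only the "Map"
# transformer. Steps are listed out of order on purpose to show the sorting.
ELT_JSON = """
{
  "Transformer": [{
    "name": "foo1",
    "type": "Map",
    "description": "Summarize the table foo from table bar",
    "flow": [
      {"step": 2, "description": "Sum c2 by c1",
       "stmt": "insert into foo(c1, c2) select c1, sum(c2) from bar group by c1"},
      {"step": 1, "description": "delete all data",
       "stmt": "delete from foo"}
    ]
  }]
}
"""


def run_transformer(conn, transformer):
    """Run every step of one transformer, ordered by "step", in one transaction.

    Hypothetical helper: returns the number of steps executed. On any SQL
    error the whole flow is rolled back and the error is re-raised.
    """
    steps = sorted(transformer["flow"], key=lambda s: s["step"])
    try:
        for step in steps:
            conn.execute(step["stmt"])
        conn.commit()
    except sqlite3.Error:
        conn.rollback()
        raise
    return len(steps)


def run_all(conn, spec):
    """Run each transformer in the spec; map transformer name -> step count."""
    return {t["name"]: run_transformer(conn, t) for t in spec["Transformer"]}


# Stand-in data so the sketch is runnable end to end.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table bar(c1 text, c2 integer);
    create table foo(c1 text, c2 integer);
    insert into bar values ('a', 1), ('a', 2), ('b', 5);
    insert into foo values ('stale', 0);
""")

result = run_all(conn, json.loads(ELT_JSON))
rows = conn.execute("select c1, c2 from foo order by c1").fetchall()
print(result, rows)
```

The SCD statement from the sample is left out here because its `Update ... From ... Inner Join` syntax is database specific and would not run on SQLite as written.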

Example of a NiFi template that executes that JSON structure:

[Inline image: image001.png (NiFi template)]


Does this make sense? Please give me feedback.

Carlos



