nifi-dev mailing list archives

From Mike Thomsen <mikerthom...@gmail.com>
Subject Re: [DISCUSS] Deprecate processors who have Record oriented counterpart?
Date Mon, 25 Feb 2019 20:18:05 GMT
In the last year, I've joined two teams that were getting started with NiFi
and I think I was the only person on the team that knew about the Record
API.

A few days ago, someone at my client gave a presentation on NiFi and was
talking about the "NiFi forums." Doing a quick Google search for "NiFi
Community Support" showed that HortonWorks's fora are above any
nifi.apache.org reference in priority. So we might have an SEO problem on
our hands too in terms of getting our preferred documentation and guides
into users' hands.

On Mon, Feb 25, 2019 at 3:12 PM Andy LoPresto <alopresto@apache.org> wrote:

> I think there are legitimate use cases for the “legacy” approaches and we
> should not deprecate them. However, I do think there can be better
> education and gentle guidance of new users to prefer the record-oriented
> processors over the legacy processors when appropriate. Whether this is a
> linked note in the processor description shown in the Add Processor dialog,
> improved documentation on the website, wizards/walkthroughs, etc. is
> certainly a good topic for conversation here.
>
> The ConvertXtoY processors should definitely be deprecated.
>
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> > On Feb 23, 2019, at 12:42 PM, Bryan Bende <bbende@gmail.com> wrote:
> >
> > One thing I would add is that in the 1.9.0 release there is now schema
> > inference built in so that you can just start using the record processors
> > without having a schema.
> >
> > That being said I am neutral about deprecating the non-record processors
> > for source and destination systems.
> >
> > The processors I would definitely be in favor of deprecating are the
> > conversion processors that are replaced by ConvertRecord (Avro to JSON,
> > JSON to Avro, csv to avro, whatever other combos) and InferAvroSchema. All
> > of those should be handled by ConvertRecord + the built in schema inference
> > option in the readers and writers.
> >
> > On Sat, Feb 23, 2019 at 1:23 PM Mike Thomsen <mikerthomsen@gmail.com> wrote:
> >
> >> Sivaprasanna,
> >>
> >> FWIW, I think there might be merit to deprecating converting to Avro, but
> >> the rest I think should stay. With Avro, I feel like there is intrinsic
> >> danger in giving people that option if they're unwilling to learn how to
> >> write an Avro schema.
> >>
> >> On Sat, Feb 23, 2019 at 1:21 PM Mike Thomsen <mikerthomsen@gmail.com>
> >> wrote:
> >>
> >>>> The number 1 thing I don't like about the Record processors is that they
> >>>> require a Schema, and the complementary processor(s?), specifically the
> >>>> GetMongo one, does not require a schema.
> >>>
> >>> FWIW, we just added GetMongoRecord in 1.9.0, along with GridFS processors.
> >>>
> >>> I'll note that arguably the best reason for you to take the dive into
> >>> being able to use the Record API w/ Mongo is precisely that Mongo doesn't
> >>> even have schema on write. It's entirely possible that 9 out of 10 people
> >>> on your team write a date the right way you agreed upon and then the 1
> >>> holdout does the polar opposite and you won't know until random, bizarre
> >>> behavior shows up.
> >>>
> >>> On Sat, Feb 23, 2019 at 12:06 PM Ryan Hendrickson <
> >>> ryan.andrew.hendrickson@gmail.com> wrote:
> >>>
> >>>> We often don't use the Record Processors because of the Schema requirement
> >>>> and the complexity of using the LookupRecord processor.
> >>>>
> >>>> I'll refer to this email in the NiFi mailing list: "GetMongo - Pass-on
> >>>> Initial FlowFile?"... There were suggestions to use the LookupRecord
> >>>> processor, but ultimately it couldn't do what we needed to be done, so we
> >>>> had to string together a set of other processors.
> >>>>
> >>>> For us, it was easier to string together a set of processors than to
> >>>> figure out why LookupRecord, MongoDBLookupService, and InferAvroSchema
> >>>> weren't getting the job done for us.
> >>>>
> >>>>            /--success---> *ReplaceText* (Prepend JSON Key) --success--\
> >>>> *GetMongo*                                                             >--> *Merge Content*
> >>>>            \--original--> *ReplaceText* (Prepend JSON Key) --success--/
> >>>>
> >>>> (*Merge Content*: Combine on Correlation Attribute Name, Binary Concat)
> >>>>
> >>>>
> >>>> If they're marked as deprecated, I'd really like to see the barrier to
> >>>> entry with the LookupRecord processors decreased.  The number 1 thing I
> >>>> don't like about the Record processors is that they require a Schema, and
> >>>> the complementary processor(s?), specifically the GetMongo one, does not
> >>>> require a schema.
> >>>>
> >>>> Ryan
> >>>>
> >>>> On Sat, Feb 23, 2019 at 11:39 AM Andrew Grande <aperepel@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> I'm not sure deprecating is warranted. In my experience, record-based
> >>>>> processors are very powerful, but have a steep learning curve the way they
> >>>>> are in NiFi today, and, frankly, simple things should be dead simple.
> >>>>>
> >>>>> Now, moving the record UX towards an easy extreme affects this equation,
> >>>>> but e.g. I never open up a conversation with a new user by talking about
> >>>>> records, Schema Registry or NiFi Registry.
> >>>>>
> >>>>> Maybe there's something coming up which I'm not aware of yet? Please share.
> >>>>>
> >>>>> Andrew
> >>>>>
> >>>>> On Sat, Feb 23, 2019, 7:43 AM Sivaprasanna <sivaprasanna246@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Team,
> >>>>>>
> >>>>>> Ever since the Record-based processors were first introduced, there has
> >>>>>> been active development in improving the Record APIs and constant
> >>>>>> interest in introducing new sets of Record-oriented processors. It has
> >>>>>> gone to a level where almost all the processors that deal with mainstream
> >>>>>> tech have a Record-based counterpart, such as the processors for MongoDB,
> >>>>>> Kafka, RDBMS, HBase, etc. These Record-based processors have overcome the
> >>>>>> limitations of the standard processors, letting us build flows which are
> >>>>>> concise and efficient, especially when we are dealing with structured
> >>>>>> data. Moreover, with the recent release of NiFi (1.9), we now have a new
> >>>>>> feature that offers schema inference capability, which further simplifies
> >>>>>> the process of building flows with such processors. Having said that, I'm
> >>>>>> wondering if this is the right time to raise the talk of deprecating
> >>>>>> processors which the community believes have a much better
> >>>>>> Record-oriented counterpart, covering all the functionalities currently
> >>>>>> offered by the standard processor.
> >>>>>>
> >>>>>> There are a few things that have to be talked about, like how the
> >>>>>> deprecated processor should be displayed in the UI, etc., but even
> >>>>>> before going down that route, I want to understand the community's
> >>>>>> thoughts on this.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Sivaprasanna
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> > --
> > Sent from Gmail Mobile
>
>
