nifi-dev mailing list archives

From Andy LoPresto <alopre...@apache.org>
Subject Re: [EXT] [discuss] Splitting NiFi framework and extension repos and releases
Date Fri, 12 Jul 2019 22:23:21 GMT
Adam,

I think your statements about “definitely a total sum of work that is greater” are true for a specific audience. As someone who routinely reviews PRs and handles release management tasks, I know where I would like to see improvements. I also write feature code (though not as often lately), but I think I’ve experienced the “contributor” role enough to be able to balance my expectations of required work. 

And yes, I’d much rather be responsible for releasing nifi-commons, with its self-contained security frameworks and services, on a routine cadence, worrying only about a constrained feature set and repeatable tests, than RM the entire application framework once every 3-4 months and have to wait for that. 

Jeff,

I think your entire third paragraph (was about to quote a line but the whole thing is great) is exactly where I am. Complex changes require complex thought from the contributor and from the reviewers. Smaller changes are easier to write correctly, review intelligently, and merge quickly. I accept the tradeoffs that requires. 


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jul 12, 2019, at 12:33 PM, Jeff <jtswork@gmail.com> wrote:
> 
> Adam,
> 
> To your point, we currently have the situation you described when
> dependencies exist between NiFi and NiFi Registry, where one must be
> released before the other.
> 
> In my opinion, it is better to be able to verify functionality in smaller
> atomic change-sets and limited scope over a larger change-set.  It may
> initially delay the release windows to coordinate across multiple
> repositories, but over the long term I think it will save more time.  It is
> generally the case that spending more time upfront saves more time in the
> long run.  The same can be said for a mono-repository; split PRs that
> address multiple areas/modules of code into smaller PRs for easier review.
> Both methods encourage the developers and reviewers to think critically
> about the changes being applied.  In a multi-repository setup, that
> critical thinking is basically forced upon the developers and reviewers,
> and I think that is a good thing for the quality of the software.
> 
> Regarding the raising or lowering of the bar for contribution, we should
> find a balance that encourages contribution quality.  Splitting our modules
> across multiple repositories may raise the bar for features or bug fixes
> that span repositories, but a contributor requires knowledge of those
> multiple modules regardless of which repository those modules live in. If a
> change is being made to multiple modules that span more than one repository,
> the only impacts to the current workflow are multiple PRs instead of a single
> PR, and the trade-off between shorter per-PR review time for multiple smaller
> PRs versus longer review time for a single larger PR.
> 
> 
> 
> On Fri, Jul 12, 2019 at 3:12 PM Adam Taft <adam@adamtaft.com> wrote:
> 
>> Andy - fair points. Note that by definition, the process you describe is
>> harder (requires more maneuvers).  Maybe it's warranted/justified for the
>> desired integrity that you are after, but it's most definitely a total sum
>> of work that is greater.
>> 
>> Your registry example is really good.  In your example, you are proposing a
>> change to the framework and commons repositories before a change to the
>> registry can be finalized.  You'd need the changes to framework and commons
>> to "land" and become released before the final change to the registry was
>> committed.  You'd end up with a small release queued up for the framework
>> (whose release cycle is mostly infrequent) and you wouldn't be able to
>> finish the work on the registry changes until that new function was
>> releasable.  The ability to mark that JIRA ticket as "closed" is delayed
>> because you are waiting for releases from dependent components.
>> 
>> Of course, you can build/test against -SNAPSHOT versions in each of those
>> repositories (which is what Bryan was getting to).  But the registry
>> feature itself can't be totally finalized and is waiting on the release
>> cycle of the slowest of the components.  There are definitely tradeoffs
>> with this direction.
>> 
>> 
>> On Fri, Jul 12, 2019 at 12:42 PM Andy LoPresto <alopresto@apache.org> wrote:
>> 
>>> I think by definition, a contribution _must_ fit into a single repository.
>>> This will force developers to carefully consider the boundaries between
>>> modules and build clean abstractions. If you are a new contributor, I would
>>> be surprised if you are making a single (logical) contribution that would
>>> span multiple repositories on the first go. I think enforcing clear
>>> divisions is good for both new and experienced contributors. I also think a
>>> change that requires contributions to multiple repositories should be
>>> subdivided into atomic tasks.
>>> 
>>> For example, if someone wants to contribute a new feature to nifi-registry
>>> which also requires changes to nifi-commons for the security piece and adds
>>> new behavior to the nifi-framework component to consume new changes from
>>> Registry, in my mind those are actually 3 atomic changes which, while
>>> related and interdependent, can all be contributed as standalone code to
>>> their respective repositories in an ordered fashion. I would prefer this
>>> over one large commit to a single repository which influences behavior in
>>> all three modules and requires one or more reviewers with comprehensive
>>> knowledge over all aspects of the project.
>>> 
>>> 
>>> Andy LoPresto
>>> alopresto@apache.org
>>> alopresto.apache@gmail.com
>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>> 
>>>> On Jul 12, 2019, at 10:49 AM, Adam Taft <adam@adamtaft.com> wrote:
>>>> 
>>>> Bryan,
>>>> 
>>>> I think both of your points are actually closely related, and they somewhat
>>>> speak to my thoughts/concerns about splitting the repository.
>>>> 
>>>> I would argue that one PR that affects multiple modules in a single
>>>> repository is _easier_ to review than multiple PRs that affect single
>>>> modules.  In the split repository model, if a change affects several
>>>> repositories, individual PRs would be issued against each repository.  A
>>>> reviewer would not as easily see the context of the changes and may even
>>>> consider them out of order.
>>>> 
>>>> In the single repository model, a PR is atomic. There is no race condition,
>>>> ordering or loss of context across multiple repositories.
>>>> 
>>>> This is the concern I was making for new contributors.  If your
>>>> contribution doesn't fit neatly into a single repository, then it's quite
>>>> the tough process to communicate and deal with changes. It will discourage
>>>> new folks from being involved, because the contribution barrier is raised.
>>>> 
>>>> It's ideal that changesets are atomic, but you definitely lose this
>>>> property in a multi-repo scenario.  Imagine rolling back a change, for
>>>> example, that spans multiple repositories.
>>>> 
>>>> Adam
>>>> 
>>>> On Fri, Jul 12, 2019 at 11:27 AM Bryan Bende <bbende@gmail.com> wrote:
>>>> 
>>>>> Two other points to throw out there...
>>>>> 
>>>>> 1) I think something to consider is how the management of pull
>>>>> requests would be impacted, since that is the main form of
>>>>> contribution.
>>>>> 
>>>>> Separate repos force pull requests to stay scoped to a given module,
>>>>> making for more straightforward reviews. It also makes it easier to
>>>>> look at a repo and see what work/contributions are still open,
>>>>> although I suppose all the PRs in the nifi repo could be labeled by
>>>>> module and then filtered, but it seems a little more tedious. Just
>>>>> something to think about.
>>>>> 
>>>>> 
>>>>> 2) We should also consider how we plan to handle changes across modules.
>>>>> 
>>>>> As an example, currently we have nifi and nifi-registry in separate
>>>>> repos, and nifi depends on nifi-registry, but nifi master always stays
>>>>> on the last release version of nifi-registry.
>>>>> 
>>>>> So if you are working on a change across both projects, the process is
>>>>> something like the following...
>>>>> 
>>>>> - Make change in nifi-registry and run a Maven install locally
>>>>> - Change nifi pom to the snapshot version of nifi-registry
>>>>> - Make changes in nifi and stage them in a branch, possibly a draft PR
>>>>> that can't be merged yet
>>>>> - nifi-registry gets released
>>>>> - Put up a PR for the nifi work, bumping the nifi-registry version to
>>>>> released version
>>>>> 
>>>>> I have no issue continuing to work like this, as long as we accept
>>>>> that the complexity of these scenarios will increase with more
>>>>> modules.
>>>>> 
>>>>> An alternative approach would be to allow master of each module to
>>>>> depend on a snapshot of a dependent module. For example, the nifi PR
>>>>> above could be merged before nifi-registry is ever released. It lets
>>>>> the work proceed instead of letting these draft changes build up, and
>>>>> it forces the dependency chain of releases to occur since now you
>>>>> can't release nifi master until nifi-registry is released. The
>>>>> downside is it requires everyone to locally build all the snapshot
>>>>> modules to get the latest changes, even if they aren't working on
>>>>> those other modules, unless there is a way for them to be provided
>>>>> through an apache infra build process.
>>>>> 
>>>>> This second point is less about mono vs multi repo, and more about how
>>>>> to manage development of a change that requires modifying several
>>>>> dependent modules.
>>>>> 
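The release-ordering constraint described in the message above can be sketched in a few lines of Python. This is a minimal illustration and not part of the NiFi build: the module names come from this thread, but the dependency edges in DEPENDS_ON and the helper names release_order and blocked_by are assumptions made up for the example. It needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each module lists the modules it depends on.
# The module names appear in the thread; the edges are illustrative only.
DEPENDS_ON = {
    "nifi-commons": set(),
    "nifi-registry": {"nifi-commons"},
    "nifi-framework": {"nifi-commons", "nifi-registry"},
}


def release_order(depends_on):
    """Return modules in an order where every dependency is released first."""
    return list(TopologicalSorter(depends_on).static_order())


def blocked_by(module, depends_on, unreleased):
    """Return the unreleased modules that `module` transitively waits on."""
    seen = set()
    stack = list(depends_on.get(module, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(depends_on.get(dep, ()))
    return sorted(seen & set(unreleased))


if __name__ == "__main__":
    print(release_order(DEPENDS_ON))
    print(blocked_by("nifi-framework", DEPENDS_ON, {"nifi-registry"}))
```

Under the "master depends on released versions" approach, a change stays in a draft branch until blocked_by returns an empty list for its module. Under the "master may depend on snapshots" approach, the same check would instead gate the release of that module.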
>>>>> On Fri, Jul 12, 2019 at 1:05 PM Kevin Doran <kdoran@apache.org> wrote:
>>>>>> 
>>>>>> Thanks Adam and Edward! This is exactly the type of discussion I was
>>>>>> hoping a detailed and specific proposal would generate, so thanks for
>>>>>> the input. I'll reply to each of you in turn:
>>>>>> 
>>>>>> Adam,
>>>>>> 
>>>>>> It is true that a repo-per-project approach is not required. I've
>>>>>> worked on projects that do it both ways and there are advantages to
>>>>>> both.
>>>>>> 
>>>>>> Single-repo was considered, but as one of the primary goals is to cut
>>>>>> down on Travis / CI build times, the multi-repo approach seemed to
>>>>>> have a big advantage. Personally, I've never found a reliable, stable
>>>>>> way to introduce CI builds to a repository with multiple projects that
>>>>>> did not require building all the projects in the repository. It's
>>>>>> possible to try to use commands to determine which files have changed
>>>>>> and infer which project(s) to build from that, but maintaining that
>>>>>> logic can get messy. If the logic is wrong, it's possible a project
>>>>>> that is not built is broken by a PR. Building everything is not an
>>>>>> option for a project our size, as our builds already time out today.
>>>>>> Fast, reliable Travis builds with no false positives / negatives are
>>>>>> definitely something NiFi needs, and I think it will be simplest to
>>>>>> get there with a multi-repo approach.
>>>>>> 
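The "infer which project(s) to build from the changed files" logic mentioned above might look roughly like the following sketch. Everything in it is an assumption for illustration: the one-directory-per-project layout, the DEPENDENTS edges, and the function name projects_to_build are hypothetical and do not describe the actual NiFi or Travis configuration. It also shows where the messiness comes from: the dependency map has to be kept in sync with the real build by hand.

```python
from pathlib import PurePosixPath

# Hypothetical mono-repo layout: top-level directory name == project name.
# Maps each project to the projects that depend on it (illustrative edges).
DEPENDENTS = {
    "nifi-commons": {"nifi-framework", "nifi-extensions"},
    "nifi-framework": {"nifi-extensions"},
    "nifi-extensions": set(),
}


def projects_to_build(changed_files, dependents):
    """Map changed paths to projects, then add every downstream project."""
    to_build = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        top = parts[0] if parts else ""
        if len(parts) < 2 or top not in dependents:
            # A file outside any known project (for example a root-level
            # file): build everything rather than risk skipping a project.
            return sorted(dependents)
        to_build.add(top)
    # Walk the dependents graph so downstream projects are rebuilt too.
    stack = list(to_build)
    while stack:
        for child in dependents[stack.pop()]:
            if child not in to_build:
                to_build.add(child)
                stack.append(child)
    return sorted(to_build)


if __name__ == "__main__":
    print(projects_to_build(["nifi-framework/src/Foo.java"], DEPENDENTS))
    print(projects_to_build(["README.md"], DEPENDENTS))
```

If an edge is missing from DEPENDENTS, a downstream project is silently skipped, which is the "project that is not built is broken by a PR" failure mode described above.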
>>>>>> That said, I agree that the *biggest* win comes from splitting
>>>>>> projects, and that splitting repos is a smaller step. I don't feel
>>>>>> strongly about it and could live with a single repo with multiple
>>>>>> projects (though, for what it's worth, the NiFi umbrella already has
>>>>>> several repositories and I personally don't feel it has been
>>>>>> burdensome).
>>>>>> 
>>>>>> And I agree - let's not start splitting JIRA projects. Let's use
>>>>>> components or labels or something to differentiate issues under the
>>>>>> existing NIFI Jira project.
>>>>>> 
>>>>>> 
>>>>>> Edward,
>>>>>> 
>>>>>> Thanks. I totally agree and I know others who feel the same way.
>>>>>> Better-defined boundaries and loosely coupled modules are 100% a
>>>>>> long-term goal. I think this project restructuring won't solve the
>>>>>> problem completely (in fact, to your point, it may uncover some
>>>>>> unfortunate tight-coupling that needs to be reworked on the current
>>>>>> master before the split can happen), but I do think it will encourage
>>>>>> developers to more faithfully build to APIs and avoid leaky
>>>>>> abstractions as there will be more hard division points in the code
>>>>>> base. Some of those issues might be able to be addressed immediately.
>>>>>> Others might have to wait for a major version change.
>>>>>> 
>>>>>> Thanks,
>>>>>> Kevin
>>>>>> 
>>>>>> On Fri, Jul 12, 2019 at 1:04 PM Adam Taft <adam@adamtaft.com> wrote:
>>>>>>> 
>>>>>>> To be honest and to your point Joe, the thing that optimizes the RM duties
>>>>>>> should probably be preferred in all of this.  There is so much overhead for
>>>>>>> the release manager, that lubricating the RM process probably trumps a lot
>>>>>>> of my concerns.  I think there's real concern for making the project harder
>>>>>>> for new contributors. But likewise, that concern should be balanced with
>>>>>>> making the project harder for longtime contributors who have pulled the
>>>>>>> cart the most.
>>>>>>> 
>>>>>>> I was just at least hoping for a discussion on the concept.  Thanks as
>>>>>>> always for your leadership and contributions to the nifi community.
>>>>>>> 
>>>>>>> On Fri, Jul 12, 2019 at 10:48 AM Joe Witt <joe.witt@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Ah I agree the JIRA thing would be too heavy handed.  A single JIRA with
>>>>>>>> well defined components tied to 'repos' is good.
>>>>>>>> 
>>>>>>>> As far as separate code repos we're talking about different releasable
>>>>>>>> artifacts for which we as a PMC are responsible for the meaning/etc..  As a
>>>>>>>> many time RM I definitely dislike the mono repo construct as I understand
>>>>>>>> it to function.  I prefer repos per source release artifact where all
>>>>>>>> source in that artifact is a function of the release. I am ok with
>>>>>>>> different convenience binaries resulting from a single source release
>>>>>>>> artifact though.
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> On Fri, Jul 12, 2019 at 12:26 PM Adam Taft <adam@adamtaft.com> wrote:
>>>>>>>> 
>>>>>>>>> I think the concerns around user management are valid, are they not?
>>>>>>>>> Overhead in JIRA goes up (assigning rights to users in JIRA is
>>>>>>>>> multiplied).  Risk to new contributors is high, because each isolated
>>>>>>>>> repository has its own life and code contribution styles.  Maybe the actual
>>>>>>>>> apache infra involvement is low, but the negative effects of community and
>>>>>>>>> source code bifurcation go up.
>>>>>>>>> 
>>>>>>>>> Tagging in mono-repos is done by prefixing the name of the component in the
>>>>>>>>> tag name.  Your release sources are still generated from the component
>>>>>>>>> folder (not from the root).
>>>>>>>>> 
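As a small sketch of the component-prefixed tagging idea above: the exact tag format is not given in the thread, so the "<component>-<version>" convention, the three-part numeric version, and the helper names parse_tag and versions_for below are all assumptions chosen for illustration.

```python
import re

# Assumed tag convention: "<component>-<major>.<minor>.<patch>",
# for example "nifi-commons-1.2.0". The thread does not specify a format.
TAG_PATTERN = re.compile(
    r"^(?P<component>[a-z][a-z0-9-]*?)-(?P<version>\d+\.\d+\.\d+)$"
)


def parse_tag(tag):
    """Split a component-prefixed tag into (component, version)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a component-prefixed tag: {tag!r}")
    return match.group("component"), match.group("version")


def versions_for(component, tags):
    """Return the versions tagged for one component, oldest first."""
    versions = []
    for tag in tags:
        try:
            name, version = parse_tag(tag)
        except ValueError:
            continue  # ignore tags that follow some other convention
        if name == component:
            versions.append(version)
    # Sort numerically so that 0.10.0 comes after 0.4.0.
    return sorted(versions, key=lambda v: tuple(int(x) for x in v.split(".")))


if __name__ == "__main__":
    tags = ["nifi-registry-0.10.0", "nifi-registry-0.4.0", "nifi-commons-1.0.0"]
    print(parse_tag("nifi-commons-1.0.0"))
    print(versions_for("nifi-registry", tags))
```

With a convention like this, each component keeps its own release history inside one repository, while the source release for a given tag would be cut from that component's folder, as the message describes.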
>>>>>>>>> Modularization (as being proposed) is a good thing, but can be done in a
>>>>>>>>> single repository. It's not a requirement to split up the git project to
>>>>>>>>> get the benefits of modularization.  That's the point I'm hoping is seen in
>>>>>>>>> this.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Jul 12, 2019 at 10:08 AM Joe Witt <joe.witt@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> to clarify user management for infra is not a prob.  it is an ldap group.
>>>>>>>>>> 
>>>>>>>>>> repo creation is self service as well and group access is tied to that.
>>>>>>>>>> 
>>>>>>>>>> release artifact is the source we produce.  this is typically correlated to
>>>>>>>>>> a tag of the repo.  if we have all source in one repo it isn't clear to me
>>>>>>>>>> how we can maintain that.
>>>>>>>>>> 
>>>>>>>>>> in any event I'm not making a statement of whether to do many repos or not.
>>>>>>>>>> just correcting some potentially misleading claims.
>>>>>>>>>> 
>>>>>>>>>> thanks
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jul 12, 2019, 12:01 PM Adam Taft <adam@adamtaft.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Just as a point of discussion, I'm not entirely sure that splitting into
>>>>>>>>>>> multiple physical git repositories is actually adding any value.  I think
>>>>>>>>>>> it's worth consideration that all the (good) changes being proposed are
>>>>>>>>>>> done under a single mono-repository model.
>>>>>>>>>>> 
>>>>>>>>>>> If we split into multiple repositories, you have substantially increased
>>>>>>>>>>> the infra surface area. User account management overhead goes up. Support
>>>>>>>>>>> from the infra team goes up. JIRA issue management goes up,
>>>>>>>>>>> misfiled/miscategorized issues become common. It becomes harder for
>>>>>>>>>>> community members to interact and engage with the project, steeper learning
>>>>>>>>>>> curve for new contributors. There are more "side channel" conversations and
>>>>>>>>>>> less transparency into the project as a whole. Git history is much harder
>>>>>>>>>>> (or impossible) to follow across the entire project. Tracking down bugs and
>>>>>>>>>>> performing git blame or git bisect becomes hard.
>>>>>>>>>>> 
>>>>>>>>>>> There's nothing really stopping all of these changes from occurring in the
>>>>>>>>>>> existing repo, we don't have to have a maven pom.xml in the root of the
>>>>>>>>>>> project repository. It's much easier for contributors to just clone a
>>>>>>>>>>> single repository, read the README at the root, and get oriented to the
>>>>>>>>>>> project layout.  Output artifacts can still be versioned differently (api
>>>>>>>>>>> can have a different version from extensions).  "Splitting out" modules can
>>>>>>>>>>> still happen in the mono-repository.  Jenkins and friends can be taught the
>>>>>>>>>>> project layout.
>>>>>>>>>>> 
>>>>>>>>>>> tl;dr - The changes being proposed can be done in a single repository.
>>>>>>>>>>> Splitting into multiple repositories is adding overhead on multiple levels,
>>>>>>>>>>> which might be a sneaky form of muda. [1]
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for reading,
>>>>>>>>>>> Adam
>>>>>>>>>>> 
>>>>>>>>>>> [1] https://dzone.com/articles/seven-wastes-software
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Jul 11, 2019 at 11:01 AM Otto Fowler <ottobackwards@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> I agree that this looks great. I think Mike’s idea is worth considering as
>>>>>>>>>>>> well. I would hope that, as part of this effort, some thought will be given
>>>>>>>>>>>> to enhancing the developer documentation around the modules as well.
>>>>>>>>>>>> 
>>>>>>>>>>>> On July 10, 2019 at 18:15:21, Mike Thomsen (mikerthomsen@gmail.com) wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I agree. It's very well thought out. One change to consider is splitting
>>>>>>>>>>>> the extensions further into two separate repos. One that would serve as a
>>>>>>>>>>>> standard library of sorts for other component developers and another that
>>>>>>>>>>>> would include everything else. Things like the Record API would go into the
>>>>>>>>>>>> former so that we could have a more conservative release schedule going
>>>>>>>>>>>> forward with those components.
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Jul 10, 2019 at 4:17 PM Andy LoPresto <alopresto@apache.org> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks Kevin, this looks really promising.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Updating the link here as I think the page may have moved:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/NIFI/NiFi+Project+and+Repository+Restructuring
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Andy LoPresto
>>>>>>>>>>>>> alopresto@apache.org
>>>>>>>>>>>>> alopresto.apache@gmail.com
>>>>>>>>>>>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Jul 10, 2019, at 12:08 PM, Kevin Doran <kdoran@apache.org> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi NiFi Dev Community,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jeff Storck, Bryan Bende, and I have been collaborating back and forth
>>>>>>>>>>>>>> on a proposal for how to restructure the NiFi source code into smaller
>>>>>>>>>>>>>> Maven projects and repositories based on the discussion that took
>>>>>>>>>>>>>> place a while back on this thread. I'm reviving this older thread in
>>>>>>>>>>>>>> order to share that proposal with the community and generate further
>>>>>>>>>>>>>> discussion about solidifying a destination and a plan for how to
>>>>>>>>>>>>>> get there.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Specifically, the proposal we've started working on has three parts:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. Goals (more or less a summary of the earlier discussion that took
>>>>>>>>>>>>>> place on this thread)
>>>>>>>>>>>>>> 2. Proposed end state of the new Maven project and repository structure
>>>>>>>>>>>>>> 3. Proposed approach for how to get from where we are today to the
>>>>>>>>>>>>>> desired end state
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The proposal is on the Apache NiFi Wiki [1], so that we can all
>>>>>>>>>>>>>> collaborate on it or leave comments there.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/NIFIREG/NiFi+Project+and+Repository+Restructuring
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Kevin, Jeff, and Bryan
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, May 30, 2019 at 1:31 PM Kevin Doran <kdoran@apache.org> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I am also in favor of splitting the nifi maven project up into smaller
>>>>>>>>>>>>>>> projects with independent release cycles in order to decouple
>>>>>>>>>>>>>>> development at well defined boundaries/interfaces and also to
>>>>>>>>>>>>>>> facilitate code reuse.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> In anticipation of eventually working towards a NiFi 2.0 that
>>>>>>>>>>>>>>> introduces bigger changes for developers and users, I've started work
>>>>>>>>>>>>>>> on a nifi-commons project in which I've extracted out some of the code
>>>>>>>>>>>>>>> that originally got ported from NiFi -> NiFi Registry, and now exists
>>>>>>>>>>>>>>> as similar code in both projects, into a standalone modular library.
>>>>>>>>>>>>>>> That preliminary work is here on my personal github account for now
>>>>>>>>>>>>>>> [1].
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So far, it only contains some security code in a submodule, and is a
>>>>>>>>>>>>>>> WIP (more work coming when I have time), but the idea is nifi-commons
>>>>>>>>>>>>>>> could have several libraries/modules and would be released
>>>>>>>>>>>>>>> periodically to use across nifi and registry. If we are talking about
>>>>>>>>>>>>>>> splitting the nifi project into framework and extensions, then
>>>>>>>>>>>>>>> nifi-commons might be a good home for code that needs to be shared
>>>>>>>>>>>>>>> across those two sub projects as well, such as the nifi-api bits Joe
>>>>>>>>>>>>>>> mentioned.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> As part of this larger effort, I would be happy to help get a
>>>>>>>>>>>>>>> nifi-commons repository started in Apache where we can move shared
>>>>>>>>>>>>>>> code such as nifi-api to prepare for splitting nifi-framework and
>>>>>>>>>>>>>>> nifi-extensions. It also occurs to me that if nifi-framework and
>>>>>>>>>>>>>>> nifi-extensions are being released independently, nifi-assembly should
>>>>>>>>>>>>>>> probably just become a project that pulls in and assembles the latest
>>>>>>>>>>>>>>> releases of framework and extensions.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Overall, I think this would be beneficial for most of the work going
>>>>>>>>>>>>>>> on in Apache NiFi, which would not have to cut across these different
>>>>>>>>>>>>>>> projects and therefore would be easier to code, test, build, and
>>>>>>>>>>>>>>> release. However, the level of difficulty will increase for changes
>>>>>>>>>>>>>>> that will need to span multiple projects, though those are fewer in
>>>>>>>>>>>>>>> number, so overall I think it would be a net win for the dev
>>>>>>>>>>>>>>> community.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [1] https://github.com/kevdoran/nifi-commons
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Thu, May 30, 2019 at 12:17 PM Andy LoPresto <alopresto@apache.org> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I am a strong +1 on the separation and reducing the build time. With
>>>>>>>>>>>>>>>> that in mind, I think the process I brought up yesterday [1] of signing our
>>>>>>>>>>>>>>>> artifacts with GPG as part of the Maven build is paramount, because we
>>>>>>>>>>>>>>>> would now be consuming core code across multiple projects/repositories, so
>>>>>>>>>>>>>>>> there is even less guarantee the code is coming from “us”.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [1] https://lists.apache.org/thread.html/5974971939c539c34148d494f11e8bcf0640c440ce5e7a768ee9db01@%3Cdev.nifi.apache.org%3E
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Andy LoPresto
>>>>>>>>>>>>>>>> alopresto@apache.org
>>>>>>>>>>>>>>>> alopresto.apache@gmail.com
>>>>>>>>>>>>>>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On May 30, 2019, at 9:15 AM, Brandon DeVries <brd@jhu.edu> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> In regards to "We 'could' also split out the 'nifi-api'...", NiFi 2.0 would
>>>>>>>>>>>>>>>>> also be a good time to look at more clearly defining the separation between
>>>>>>>>>>>>>>>>> the UI and the framework. Where nifi-api is the contract between the
>>>>>>>>>>>>>>>>> extensions and the framework, the NiFi Rest api is the contract between the
>>>>>>>>>>>>>>>>> UI and framework... These pieces could potentially be built / deployed /
>>>>>>>>>>>>>>>>> updated independently.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Thu, May 30, 2019 at 11:39 AM Jeff <jtswork@gmail.com> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> In the same category of challenges that Peter pointed out, it might be
>>>>>>>>>>>>>>>>>> difficult for Travis to build the "framework" and "extensions" projects if
>>>>>>>>>>>>>>>>>> there are changes in a PR that affect both projects.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Is there a good way in Travis to have the workspace/maven repo shared
>>>>>>>>>>>>>>>>>> between projects in a single build?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> It's probably always in the direction of the extensions project needing
>>>>>>>>>>>>>>>>>> something new to be added to the framework project rather than the other
>>>>>>>>>>>>>>>>>> way around, but it'll be tricky to get that working right in Travis if it's
>>>>>>>>>>>>>>>>>> not possible to set up the Travis build to know it needs to deploy the
>>>>>>>>>>>>>>>>>> framework project artifacts into a maven repo that the extension project
>>>>>>>>>>>>>>>>>> will use.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> One way might be to make sure that changes to the framework project must be
>>>>>>>>>>>>>>>>>> in master before the extensions project can make use of them, but that
>>>>>>>>>>>>>>>>>> would require a "default master" build for the framework project which
>>>>>>>>>>>>>>>>>> builds master after each commit, and deploys the build artifacts to a
>>>>>>>>>>>>>>>>>> persistent maven repo that the extension project builds can access.  It
>>>>>>>>>>>>>>>>>> also makes project-spanning change-sets take longer to review and get fully
>>>>>>>>>>>>>>>>>> committed to master.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Thu, May 30, 2019 at 11:23 AM Peter Wicks (pwicks) <pwicks@micron.com> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> One more "not awesome" would be that core changes that affect extensions
>>>>>>>>>>>>>>>>>>> will be a little harder to test. If I make a core change that changes the
>>>>>>>>>>>>>>>>>>> signature of an interface/etc... I'll need to do some extra work to make
>>>>>>>>>>>>>>>>>>> sure I don't break extensions that use it.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Still worth it, just one more thing to mention.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>> From: Joe Witt <joewitt@apache.org>
>>>>>>>>>>>>>>>>>>> Sent: Thursday, May 30, 2019 9:19 AM
>>>>>>>>>>>>>>>>>>> To: dev@nifi.apache.org
>>>>>>>>>>>>>>>>>>> Subject: [EXT] [discuss] Splitting NiFi framework and extension repos and releases
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Team,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> We've discussed this a bit over the years in various forms but it again
>>>>>>>>>>>>>>>>>>> seems time to progress this topic and enough has changed I think to warrant
>>>>>>>>>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Tensions:
>>>>>>>>>>>>>>>>>>> 1) Our build times take too long. In travis-ci for
>>>>>>>>>>>>>>>>>>> instance it takes 40 minutes when it works.
>>>>>>>>>>>>>>>>>>> 2) The number of builds we do has increased. We do
>>>>>>>>>>>>>>>>>>> us/jp/fr builds on open and oracle JDKs. That is 6
>>>>>>>>>>>>>>>>>>> builds.
>>>>>>>>>>>>>>>>>>> 3) We want to add Java 11 support such that one could
>>>>>>>>>>>>>>>>>>> build with 8 or 11 and the above still apply. That
>>>>>>>>>>>>>>>>>>> becomes 6 builds.
>>>>>>>>>>>>>>>>>>> 4) With the progress in NiFi registry we can now load
>>>>>>>>>>>>>>>>>>> artifacts there and could pull them into NiFi. And this
>>>>>>>>>>>>>>>>>>> integration will only get better.
>>>>>>>>>>>>>>>>>>> 5) The NiFi build is too huge and cannot grow any
>>>>>>>>>>>>>>>>>>> longer or else we cannot upload convenience binaries.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> We cannot solve all the things just yet but we can make
>>>>>>>>>>>>>>>>>>> progress. I suggest we split apart the NiFi
>>>>>>>>>>>>>>>>>>> 'framework/application' in its own release cycle and
>>>>>>>>>>>>>>>>>>> code repository from the 'nifi extensions' into its own
>>>>>>>>>>>>>>>>>>> repository and release cycle. The NiFi release would
>>>>>>>>>>>>>>>>>>> still pull in a specific set of extension bundles so to
>>>>>>>>>>>>>>>>>>> our end users at this time there is no change. In the
>>>>>>>>>>>>>>>>>>> future we could also just stop including the extensions
>>>>>>>>>>>>>>>>>>> in nifi the application and they could be sourced at
>>>>>>>>>>>>>>>>>>> runtime as needed from the registry (call that a NiFi
>>>>>>>>>>>>>>>>>>> 2.x thing).
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Why does this help?
>>>>>>>>>>>>>>>>>>> - Builds would only take as long as just extensions
>>>>>>>>>>>>>>>>>>> take or just core/app takes. This reduces time for each
>>>>>>>>>>>>>>>>>>> change cycle and reduces load on travis-ci which runs
>>>>>>>>>>>>>>>>>>> the same tests over and over and over for each pull
>>>>>>>>>>>>>>>>>>> request/push regardless of whether it was an extension
>>>>>>>>>>>>>>>>>>> or core.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> - It moves us toward the direction we're heading anyway
>>>>>>>>>>>>>>>>>>> whereby extensions can have their own lifecycle from
>>>>>>>>>>>>>>>>>>> the framework/app itself.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> How is this not awesome:
>>>>>>>>>>>>>>>>>>> - Doesn't yet solve for the large builds problem. I
>>>>>>>>>>>>>>>>>>> think we'll get there with a NiFi 2.x release which
>>>>>>>>>>>>>>>>>>> fully leverages nifi-registry for retrieval of all
>>>>>>>>>>>>>>>>>>> extensions.
>>>>>>>>>>>>>>>>>>> - Adds another 'thing we need to do a release cycle
>>>>>>>>>>>>>>>>>>> for'. This is generally unpleasant but it is paid for
>>>>>>>>>>>>>>>>>>> once a release cycle and it does allow us to release
>>>>>>>>>>>>>>>>>>> independently for new cool extensions/fixes apart from
>>>>>>>>>>>>>>>>>>> the framework itself.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Would be great to hear others' thoughts if they too
>>>>>>>>>>>>>>>>>>> feel it is time to make this happen.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>>> Joe
>>>>>>>>>>>>>>>>>>> 

