asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: Migration of git repository
Date Mon, 08 Jun 2015 20:20:11 GMT
> My not-totally-thought-out suggestion for problem #2 would be to
> not "solve" it at all, and simply state that the tip of Asterix requires
> the latest tip of Hyracks to build. That's the way we all develop code on
> our local machines anyway, as far as I know. If there are no outside
> clients that we have to be concerned about between releases, doesn't this
> solve the problem?


This is what we have today. AsterixDB already relies on the Hyracks
snapshot release, and in the developer instructions we tell folks to
install Hyracks first by checking it out and doing 'mvn install'.

The issue with this is that it makes build+test hard, because we are
fighting against maven when we do this. Maven wants to resolve the
dependencies based on version, so when we don't have versions that convey
the actual dependencies that are present (e.g. between git revisions), it
gets messy. One has to somehow use a side channel to convey the true
dependence (like the topic field in Gerrit).
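
To make the mismatch concrete, here is a minimal sketch in Python. It is a
toy model, not Maven's actual resolution logic: the local repository is
reduced to a dictionary with one slot per (groupId, artifactId, version)
coordinate, and the coordinate and revision names are made up for
illustration.

```python
from typing import Dict, Tuple

# Hypothetical coordinate type: (groupId, artifactId, version).
Coordinate = Tuple[str, str, str]


def install(repo: Dict[Coordinate, str], coord: Coordinate, git_rev: str) -> None:
    """Toy stand-in for 'mvn install': the newest build takes over the slot."""
    repo[coord] = git_rev


def resolve(repo: Dict[Coordinate, str], coord: Coordinate) -> str:
    """Toy stand-in for dependency resolution: lookup is by coordinate only."""
    return repo[coord]


local_repo: Dict[Coordinate, str] = {}

# Assumed, illustrative coordinate; the version string mirrors a snapshot build.
hyracks = ("org.example", "hyracks", "4.5.6-SNAPSHOT")

install(local_repo, hyracks, "rev-a")  # build from one git revision
install(local_repo, hyracks, "rev-b")  # build from a later git revision

# The version string cannot distinguish the two builds, so the earlier one is gone.
print(resolve(local_repo, hyracks))  # rev-b
```

Both builds carry the same version string, so nothing in the Asterix POM can
say which Hyracks revision it was actually tested against.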

> As a side note, the original proposal to merge the codebases would "solve"
> [sweep under the rug] problem #1 for Asterix, at the cost of quite possibly
> making it worse for VXQuery.

Right now VXQuery and Pregelix depend on a stable Hyracks version, so
changes in Hyracks master could require changes in VXQuery or Pregelix in
order for them to use later versions. This could happen (or have already
happened) today, without any notice until someone tried to upgrade the
versions. Whether or not Hyracks and AsterixDB happen to live in one git
repository or two has no effect on that. The reason for it is simply that
we have chosen not to test all projects downstream of Hyracks at the
granularity of commits, and we can change that either way.

- Ian



On Mon, Jun 8, 2015 at 12:01 PM, Chris Hillery <chillery@hillery.land>
wrote:

> I think maybe part of the reason we're having a tough time figuring this
> out is that we're conflating two different problems.
>
> 1. We want to ensure that changes to Hyracks don't break Asterix, VXQuery,
> etc.
>
> 2. We fairly often need to make related changes in Hyracks and Asterix that
> "go together", ie, Asterix won't build/work with the new change until it
> can see the corresponding Hyracks change.
>
> Those really are completely different problems and may well need different
> solutions.
>
> IMHO, the first one is "easy" [*] to solve via testing. Either we add
> proper API testing to Hyracks and ensure Asterix/VXQuery/etc only use
> proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to the
> testing jobs on Jenkins.
>
> The second problem is where we get into the trickiness of Maven releases
> vs. Apache releases. This is why I asked about the actual requirements and
> audience. My not-totally-thought-out suggestion for problem #2 would be to
> not "solve" it at all, and simply state that the tip of Asterix requires
> the latest tip of Hyracks to build. That's the way we all develop code on
> our local machines anyway, as far as I know. If there are no outside
> clients that we have to be concerned about between releases, doesn't this
> solve the problem?
>
> Obviously when it comes time to make a real Hyracks (or Asterix) release
> we'll need to do a little extra work to ensure those *released* codebases
> build together. That might mean that we usually need to make Hyracks and
> Asterix releases at the same time, and I don't know whether that's now
> harder to achieve in the incubator world.
>
> (As a side note, the original proposal to merge the codebases would "solve"
> [sweep under the rug] problem #1 for Asterix, at the cost of quite possibly
> making it worse for VXQuery. It would sort of "solve" problem #2 for
> Asterix as well, because it would physically enforce the same tip-tip rule
> I'm proposing above. I still believe that we can solve both problems in
> other strictly superior ways, however.)
>
> Ceej
> aka Chris Hillery
>
> [*] - not actually easy.
>
> On Mon, Jun 8, 2015 at 6:39 AM, Mike Carey <dtabass@gmail.com> wrote:
>
> > All,
> >
> > It feels to me (as one who is completely naive about much of this stuff)
> > like we need two levels of "releases", one level for the outside world
> > (the public releases that users might pick up) and a different internal
> > level for the development process (where we essentially want to have
> > tagged/extra-tested checkpoints and want to be able to manage in a
> > careful way the cross-dependencies from/to other related development
> > processes X - e.g., for X = VXQuery, AsterixDB, and someday Pregelix).
> > When we do an official signed release of anything, we'd need to do one
> > for the DAG of things - so there might be sync'ed "multireleases" (for
> > Hyracks and then for X).  Does that make any sense and/or give anyone
> > more thoughts about how we might achieve that...?
> >
> > Cheers,
> > Mike
> >
> >
> >
> > On 6/8/15 2:08 AM, Chris Hillery wrote:
> >
> >> If not, it may be worth taking a step back and asking what exactly the
> >> problem is. I understand the general rule that "we don't want Asterix
> >> to be broken", but what precisely does that mean? Is it acceptable that
> >> the tip of the Asterix source branch is only guaranteed to build against
> >> the tip of the Hyracks branch, for example? If not, why not? What
> >> audience are we required to keep things working for at the source level,
> >> and what expectations do they have?
> >>
> >> Ceej
> >> aka Chris Hillery
> >>
> >> On Mon, Jun 8, 2015 at 2:06 AM, Chris Hillery <chillery@hillery.land>
> >> wrote:
> >>
> >>> So, if we pushed these not-releases to the Nexus repo running at UCI,
> >>> and devs pulled from there in preference to "official" repos, that
> >>> would solve the problem?
> >>>
> >>> Ceej
> >>> aka Chris Hillery
> >>>
> >>> On Sun, Jun 7, 2015 at 7:29 PM, Ted Dunning <ted.dunning@gmail.com>
> >>> wrote:
> >>>
> >>>> If it is pushed to any wider audience than roughly the dev@ list, it
> >>>> is a release. That definitely includes maven central.  Artifacts in
> >>>> maven are convenience binaries and thus not a release but they should
> >>>> be traceable to an exact source release.
> >>>>
> >>>> Sent from my iPhone
> >>>>
> >>>>  On Jun 7, 2015, at 19:10, Till Westmann <tillw@apache.org> wrote:
> >>>>>
> >>>>> Hmm, good point. It doesn't have to. One question might be if we can
> >>>>> push it to some maven repository, if it's not an official release.
> >>>>> But I think that should also be fine as long as we don't push it to a
> >>>>> repository that claims to contain official releases.
> >>>>>
> >>>>> Some mentor input might be helpful on this as well :)
> >>>>>
> >>>>> Cheers,
> >>>>> Till
> >>>>>
> >>>>> On Jun 7, 2015, at 6:53 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
> >>>>>>
> >>>>>> Does version bump always mean full-fledged Apache release? We need
> >>>>>> the former just to resolve compile time dependencies.
> >>>>
> >>>>>> On Jun 7, 2015, at 18:49, Till Westmann <tillw@apache.org> wrote:
> >>>>>>>
> >>>>>>> In principle I agree with this, but creating a new release will be
> >>>>>>> a little more involved than just running maven, when we do this at
> >>>>>>> the ASF.
> >>>>>>>
> >>>>>>> To publish a new release we will have to vet and vote on the
> >>>>>>> release. This takes at least 72 hours in the best case if we're a
> >>>>>>> TLP, the first release candidate is great, and have enough people
> >>>>>>> to vote. While we're still in the incubator, releasing will take a
> >>>>>>> little longer as we also have to get enough votes for the release
> >>>>>>> in the incubator.
> >>>>>>>
> >>>>>>> As I proposed earlier, it would be really good to go through the
> >>>>>>> full release process once, before we decide how to structure our
> >>>>>>> processes and infrastructure.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Till
> >>>>>>>
> >>>>>>> On Jun 4, 2015, at 6:37 PM, Ildar Absalyamov <ildar.absalyamov@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> I am with Chris on repository separation and I think that the
> >>>>>>>> solution to the issue of Hyracks commits breaking Asterix build
> >>>>>>>> is using release Hyracks versions instead of snapshot ones. Yes,
> >>>>>>>> that will create frequent Hyracks releases (we will have to
> >>>>>>>> release it each time there is a change which spans both Hyracks
> >>>>>>>> & Asterix) and we have abandoned this practice a while ago, but
> >>>>>>>> it seems that's the only way to separate projects logically.
> >>>>>>>>
> >>>>>>>> Here are a few examples to clear the picture. In all examples
> >>>>>>>> Hyracks version is 4.5.6-Snapshot, Asterix version is
> >>>>>>>> 1.2.3-Snapshot (but it depends on previous release version
> >>>>>>>> Hyracks 4.5.5):
> >>>>>>>>
> >>>>>>>> 1) The changes span both Asterix & Hyracks.
> >>>>>>>> First make sure that Asterix could depend on Hyracks
> >>>>>>>> 4.5.6-Snapshot without API conflicts & switch Asterix dependency
> >>>>>>>> to 4.5.6-Snapshot. Submit Gerrit review, once it is done as a
> >>>>>>>> part of git-asf script commit changes, bump Hyracks version to
> >>>>>>>> 4.5.6, make Asterix depend on 4.5.6 and bump Hyracks to
> >>>>>>>> 4.5.7-Snapshot right after.
> >>>>>>>>
> >>>>>>>> 2) The changes are located only in Hyracks. Regular review and
> >>>>>>>> commit (with snapshot version) without any version bump.
> >>>>>>>>
> >>>>>>>> 3) The changes are located only in Asterix. Regular review and
> >>>>>>>> commit (with snapshot version) without any version bump.
> >>>>>>>>
> >>>>>>>> In this scenario Hyracks commit can never make Asterix build fail
> >>>>>>>> (since it depends on a stable release) and it's the
> >>>>>>>> responsibility of the first person whose commit spans both repos
> >>>>>>>> to make sure that the changes in snapshot Hyracks version are
> >>>>>>>> properly merged.
> >>>>>>>>
> >>>>>>>> Regarding the Yingyi's issue with Gerrit topics: could we modify
> >>>>>>>> git-gerrit script so it would submit both Asterix & Hyracks
> >>>>>>>> reviews (granted that the latter is needed), and link them
> >>>>>>>> together, setting the proper topic? Gerrit seems to have API for
> >>>>>>>> changing that, right?
> >>>>
> >>>>>>>> On Jun 4, 2015, at 15:45, Mike Carey <dtabass@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Just a quick high-level note from our nearest equivalent of the
> >>>>>>>>> pointy-haired Dilbert guy (aka me):  What would be nice is to
> >>>>>>>>> have Hyracks changes kick off tests of all "supported client
> >>>>>>>>> projects" - AsterixDB, VXQuery, maybe also Pregelix, IMRU, and
> >>>>>>>>> possibly others in the future. I don't think we'll ever prevent
> >>>>>>>>> such downstream things from being broken unless we run their
> >>>>>>>>> tests - so I would suggest that we need a mechanism to keep
> >>>>>>>>> Hyracks changes from being permitted to happen without verifying
> >>>>>>>>> the ongoing integrity of all "blessed" (priority 1) affected
> >>>>>>>>> projects.... We could have an agreed upon list of such projects
> >>>>>>>>> and tests for each.... It would be nice to have a "quick check"
> >>>>>>>>> (hello world still works, basics are working) that was
> >>>>>>>>> synchronously blocking of such changes, and at least a daily
> >>>>>>>>> verification that all's totally well (AFAWK) for them all.
> >>>>>>>>>
> >>>>>>>>> Not sure how this affects the still two-sided discussion...  :-)
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Mike
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 6/2/15 10:00 AM, Chris Hillery wrote:
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> In my opinion,  merging the repository doesn't break the
> >>>>>>>>>>> separation of hyracks and asterixdb, because the dependencies
> >>>>>>>>>>> are controlled by mvn pom files.
> >>>>>>>>>>
> >>>>>>>>>> That wasn't the separation I was talking about. I meant API
> >>>>>>>>>> separation. As it is now, when we make a change to both Asterix
> >>>>>>>>>> and Hyracks, we are forced to consider the API implications, or
> >>>>>>>>>> at least they are put out there in a very clear way that we
> >>>>>>>>>> need to look at. If we merge them, people will (rightly) treat
> >>>>>>>>>> the whole thing as one product, and there will be no brakes on
> >>>>>>>>>> making wide-ranging API changes.
> >>>>>>>>>>
> >>>>>>>>>> (As an aside: I don't trust Maven's pom files to do a good job
> >>>>>>>>>> of keeping the dependency management clean. In fact I trust it
> >>>>>>>>>> to do precisely the opposite, by making it both easier to screw
> >>>>>>>>>> up the dependencies and harder to update them in future.)
> >>>>>>>>>>
> >>>>>>>>>> Again, my point is this: If we truly believe that Hyracks is a
> >>>>>>>>>> re-usable component, it should be treated as such from source
> >>>>>>>>>> to build to delivery. By merging in Asterix, we are saying that
> >>>>>>>>>> Asterix is "more equal" than other Hyracks clients, to the
> >>>>>>>>>> point that we're tacitly willing to break those other clients
> >>>>>>>>>> in favor of simplifying Asterix development. If that is a fair
> >>>>>>>>>> and true statement, well, then, sure, let's merge them.
> >>>>>>>>>>
> >>>>>>>>>>> 1) It forces those hyracks-only changes to pass asterixdb
> >>>>>>>>>>> regression tests.  Currently hyracks-only changes are not
> >>>>>>>>>>> verified by asterixdb tests.
> >>>>>>>>>>
> >>>>>>>>>> This is a good point, I will admit. However, I think this same
> >>>>>>>>>> goal can be met in other ways. My strong preference would be to
> >>>>>>>>>> create a set of true API tests inside of Hyracks, which both
> >>>>>>>>>> document and test the external Hyracks API. That will make
> >>>>>>>>>> API-breaking changes in future much easier to spot, and also
> >>>>>>>>>> make it clear when Asterix is using internal APIs that it
> >>>>>>>>>> should not.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> 2) On my local machine,  I don't need to always install
> >>>>>>>>>>> hyracks and then verify asterixdb from time to time.
> >>>>>>>>>>> Especially, switching branches seems painful because the
> >>>>>>>>>>> installed hyracks snapshot is overwritten from time to time.
> >>>>>>>>>>
> >>>>>>>>>> I haven't tried working on multiple Hyracks branches at the
> >>>>>>>>>> same time, so I haven't experienced this. This seems like a
> >>>>>>>>>> working method error, though. If you're working with two things
> >>>>>>>>>> that are "the same version" (even if that's a snapshot
> >>>>>>>>>> version), you'll need to use separate Maven repositories to
> >>>>>>>>>> install them. In fact, merging the two git repositories would
> >>>>>>>>>> do nothing to fix this problem, would it? If the proposal is to
> >>>>>>>>>> put the two source repositories in the same git repo but
> >>>>>>>>>> otherwise leave them untouched, then nothing would change in
> >>>>>>>>>> the build process. It's possible I'm missing something there,
> >>>>>>>>>> though.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> 3) I only need to make one code review request and one jenkins
> >>>>>>>>>>> job. Currently I need to manually change the topic of my
> >>>>>>>>>>> asterixdb gerrit CL every time before I update my hyracks CL,
> >>>>>>>>>>> and then manually schedule jenkins to run a new asterixdb
> >>>>>>>>>>> job.  If I forget to schedule the jenkins job, the asterixdb
> >>>>>>>>>>> CL is still shown to be "verified by jenkins".
> >>>>>>>>>>
> >>>>>>>>>> This is a problem, but it's a problem in commit validation, not
> >>>>>>>>>> in the source. Modifying the source to work around these issues
> >>>>>>>>>> is still a bad idea IMHO.
> >>>>>>>>>>
> >>>>>>>>>> The "change-topic" issue could be fixed with a bit of
> >>>>>>>>>> development work (have the topic point to a change, rather than
> >>>>>>>>>> a specific patchset on the change, so you only need to set it
> >>>>>>>>>> once, for instance).
> >>>>>>>>>>
> >>>>>>>>>> As for manually scheduling Asterix Jenkins jobs, that sounds
> >>>>>>>>>> like it's only a problem where your Hyracks change breaks an
> >>>>>>>>>> existing public API. That would be obviated by having true API
> >>>>>>>>>> testing inside of Hyracks, which is something that we should
> >>>>>>>>>> have regardless of any decisions about source locations.
> >>>>>>>>>>
> >>>>>>>>>> In summary / repeating myself again: yes, we have some problems
> >>>>>>>>>> because Hyracks and Asterix are in separate repositories. But
> >>>>>>>>>> those problems are pointing out true issues with our
> >>>>>>>>>> development and processes. Merging the repositories isn't
> >>>>>>>>>> fixing those problems, it's sweeping them under the rug. Long
> >>>>>>>>>> term we would be much better off to identify, isolate, and fix
> >>>>>>>>>> the problems themselves.
> >>>>>>>>>>
> >>>>>>>>>> Ceej
> >>>>>>>>>> aka Chris Hillery
> >>>>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>> Ildar
> >>>>>>>>
> >>>>>> Best regards,
> >>>>>> Ildar
> >>>>>>
> >>>>>>
> >>>
> >
>
