asterixdb-dev mailing list archives

From Mike Carey <dtab...@gmail.com>
Subject Re: Migration of git repository
Date Mon, 08 Jun 2015 21:10:42 GMT
Agreed.  Hyracks doesn't have a narrow testable API, I fear, so that'd 
be Mission Impossible.  :-)

On 6/8/15 12:15 PM, Steven Jacobs wrote:
> *IMHO, the first one is "easy" [*] to solve via testing. Either we
> add proper API testing to Hyracks and ensure Asterix/VXQuery/etc only
> use proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to
> the testing jobs on Jenkins.*
> -We should be going in the direction of the latter here. As discussed,
> there are more issues that we see than simple API breakage.
>
> Steven
>
> On Mon, Jun 8, 2015 at 12:01 PM, Chris Hillery <chillery@hillery.land>
> wrote:
>
>> I think maybe part of the reason we're having a tough time figuring this
>> out is that we're conflating two different problems.
>>
>> 1. We want to ensure that changes to Hyracks don't break Asterix, VXQuery,
>> etc.
>>
>> 2. We fairly often need to make related changes in Hyracks and Asterix that
>> "go together", ie, Asterix won't build/work with the new change until it
>> can see the corresponding Hyracks change.
>>
>> Those really are completely different problems and may well need different
>> solutions.
>>
>> IMHO, the first one is "easy" [*] to solve via testing. Either we add
>> proper API testing to Hyracks and ensure Asterix/VXQuery/etc only use
>> proper APIs, and/or we add Asterix/VXQuery/etc builds and tests to the
>> testing jobs on Jenkins.
>>
>> The second problem is where we get into the trickiness of Maven releases
>> vs. Apache releases. This is why I asked about the actual requirements and
>> audience. My not-totally-thought-out suggestion for problem #2 would be to
>> not "solve" it at all, and simply state that the tip of Asterix requires
>> the latest tip of Hyracks to build. That's the way we all develop code on
>> our local machines anyway, as far as I know. If there are no outside
>> clients that we have to be concerned about between releases, doesn't this
>> solve the problem?
>>
>> Obviously when it comes time to make a real Hyracks (or Asterix) release
>> we'll need to do a little extra work to ensure those *released* codebases
>> build together. That might mean that we usually need to make Hyracks and
>> Asterix releases at the same time, and I don't know whether that's now
>> harder to achieve in the incubator world.
>>
>> (As a side note, the original proposal to merge the codebases would "solve"
>> [sweep under the rug] problem #1 for Asterix, at the cost of quite possibly
>> making it worse for VXQuery. It would sort of "solve" problem #2 for
>> Asterix as well, because it would physically enforce the same tip-tip rule
>> I'm proposing above. I still believe that we can solve both problems in
>> other strictly superior ways, however.)
>>
>> Ceej
>> aka Chris Hillery
>>
>> [*] - not actually easy.
>>
>> On Mon, Jun 8, 2015 at 6:39 AM, Mike Carey <dtabass@gmail.com> wrote:
>>
>>> All,
>>>
>>> It feels to me (as one who is completely naive about much of this stuff)
>>> like we need two levels of "releases", one level for the outside world
>>> (the public releases that users might pick up) and a different internal
>>> level for the development process (where we essentially want to have
>>> tagged/extra-tested checkpoints and want to be able to manage in a
>>> careful way the cross-dependencies from/to other related development
>>> processes X - e.g., for X = VXQuery, AsterixDB, and someday Pregelix).
>>> When we do an official signed release of anything, we'd need to do one
>>> for the DAG of things - so there might be sync'ed "multireleases" (for
>>> Hyracks and then for X).  Does that make any sense and/or give anyone
>>> more thoughts about how we might achieve that...?
>>>
>>> Cheers,
>>> Mike
>>>
>>>
>>>
>>> On 6/8/15 2:08 AM, Chris Hillery wrote:
>>>
>>>> If not, it may be worth taking a step back and asking what exactly the
>>>> problem is. I understand the general rule that "we don't want Asterix to
>>>> be broken", but what precisely does that mean? Is it acceptable that the
>>>> tip of the Asterix source branch is only guaranteed to build against the
>>>> tip of the Hyracks branch, for example? If not, why not? What audience
>>>> are we required to keep things working for at the source level, and what
>>>> expectations do they have?
>>>>
>>>> Ceej
>>>> aka Chris Hillery
>>>>
>>>> On Mon, Jun 8, 2015 at 2:06 AM, Chris Hillery <chillery@hillery.land>
>>>> wrote:
>>>>
>>>>> So, if we pushed these not-releases to the Nexus repo running at UCI,
>>>>> and devs pulled from there in preference to "official" repos, that would
>>>>> solve the problem?
>>>>>
>>>>> Ceej
>>>>> aka Chris Hillery
>>>>>
>>>>> On Sun, Jun 7, 2015 at 7:29 PM, Ted Dunning <ted.dunning@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> If it is pushed to any wider audience than roughly the dev@ list, it is
>>>>>> a release. That definitely includes maven central.  Artifacts in maven
>>>>>> are convenience binaries and thus not a release but they should be
>>>>>> traceable to an exact source release.
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Jun 7, 2015, at 19:10, Till Westmann <tillw@apache.org> wrote:
>>>>>>
>>>>>>> Hmm, good point. It doesn’t have to. One question might be if we can
>>>>>>> push it to some maven repository, if it’s not an official release.
>>>>>>> But I think that should also be fine as long as we don’t push it to a
>>>>>>> repository that claims to contain official releases.
>>>>>>>
>>>>>>> Some mentor input might be helpful on this as well :)
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Jun 7, 2015, at 6:53 PM, Ildar Absalyamov
>>>>>>> <ildar.absalyamov@gmail.com> wrote:
>>>>>>>
>>>>>>>> Does a version bump always mean a full-fledged Apache release? We need
>>>>>>>> the former just to resolve compile time dependencies.
>>>>>>>>
>>>>>>>> On Jun 7, 2015, at 18:49, Till Westmann <tillw@apache.org> wrote:
>>>>>>>>
>>>>>>>>> In principle I agree with this, but creating a new release will be a
>>>>>>>>> little more involved than just running maven, when we do this at the
>>>>>>>>> ASF.
>>>>>>>>>
>>>>>>>>> To publish a new release we will have to vet and vote on the release.
>>>>>>>>> This takes at least 72 hours in the best case, if we’re a TLP, the
>>>>>>>>> first release candidate is great, and we have enough people to vote.
>>>>>>>>> While we’re still in the incubator, releasing will take a little
>>>>>>>>> longer as we also have to get enough votes for the release in the
>>>>>>>>> incubator.
>>>>>>>>>
>>>>>>>>> As I proposed earlier, it would be really good to go through the full
>>>>>>>>> release process once, before we decide how to structure our processes
>>>>>>>>> and infrastructure.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Till
>>>>>>>>>
>>>>>>>>> On Jun 4, 2015, at 6:37 PM, Ildar Absalyamov
>>>>>>>>> <ildar.absalyamov@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I am with Chris on repository separation and I think that the
>>>>>>>>>> solution to the issue of Hyracks commits breaking the Asterix build
>>>>>>>>>> is using release Hyracks versions instead of snapshot ones. Yes, that
>>>>>>>>>> will create frequent Hyracks releases (we will have to release it
>>>>>>>>>> each time there is a change which spans both Hyracks & Asterix) and
>>>>>>>>>> we abandoned this practice a while ago, but it seems that’s the only
>>>>>>>>>> way to separate the projects logically.
>>>>>>>>>>
>>>>>>>>>> Here are a few examples to clear up the picture. In all examples the
>>>>>>>>>> Hyracks version is 4.5.6-Snapshot and the Asterix version is
>>>>>>>>>> 1.2.3-Snapshot (but it depends on the previous release version,
>>>>>>>>>> Hyracks 4.5.5):
>>>>>>>>>>
>>>>>>>>>> 1) The changes span both Asterix & Hyracks.
>>>>>>>>>> First make sure that Asterix could depend on Hyracks 4.5.6-Snapshot
>>>>>>>>>> without API conflicts & switch the Asterix dependency to
>>>>>>>>>> 4.5.6-Snapshot. Submit the Gerrit review; once it is done, as a part
>>>>>>>>>> of the git-asf script commit the changes, bump the Hyracks version to
>>>>>>>>>> 4.5.6, make Asterix depend on 4.5.6 and bump Hyracks to
>>>>>>>>>> 4.5.7-Snapshot right after.
>>>>>>>>>>
>>>>>>>>>> 2) The changes are located only in Hyracks. Regular review and
>>>>>>>>>> commit (with snapshot version) without any version bump.
>>>>>>>>>>
>>>>>>>>>> 3) The changes are located only in Asterix. Regular review and
>>>>>>>>>> commit (with snapshot version) without any version bump.
>>>>>>>>>>
>>>>>>>>>> In this scenario a Hyracks commit can never make the Asterix build
>>>>>>>>>> fail (since it depends on a stable release) and it’s the
>>>>>>>>>> responsibility of the first person whose commit spans both repos to
>>>>>>>>>> make sure that the changes in the snapshot Hyracks version are
>>>>>>>>>> properly merged.
>>>>>>>>>>
>>>>>>>>>> Regarding Yingyi’s issue with Gerrit topics: could we modify the
>>>>>>>>>> git-gerrit script so it would submit both Asterix & Hyracks reviews
>>>>>>>>>> (granted that the latter is needed), and link them together, setting
>>>>>>>>>> the proper topic? Gerrit seems to have an API for changing that,
>>>>>>>>>> right?
>>>>>>>>>>
>>>>>>>>>> On Jun 4, 2015, at 15:45, Mike Carey <dtabass@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Just a quick high-level note from our nearest equivalent of the
>>>>>>>>>>> pointy-haired Dilbert guy (aka me):  What would be nice is to have
>>>>>>>>>>> Hyracks changes kick off tests of all "supported client projects" -
>>>>>>>>>>> AsterixDB, VXQuery, maybe also Pregelix, IMRU, and possibly others in
>>>>>>>>>>> the future. I don't think we'll ever prevent such downstream things
>>>>>>>>>>> from being broken unless we run their tests - so I would suggest that
>>>>>>>>>>> we need a mechanism to keep Hyracks changes from being permitted to
>>>>>>>>>>> happen without verifying the ongoing integrity of all "blessed"
>>>>>>>>>>> (priority 1) affected projects....  We could have an agreed upon list
>>>>>>>>>>> of such projects and tests for each....  It would be nice to have a
>>>>>>>>>>> "quick check" (hello world still works, basics are working) that was
>>>>>>>>>>> synchronously blocking of such changes, and at least a daily
>>>>>>>>>>> verification that all's totally well (AFAWK) for them all.
>>>>>>>>>>>
>>>>>>>>>>> Not sure how this affects the still two-sided discussion...  :-)
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Mike
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/2/15 10:00 AM, Chris Hillery wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <buyingyi@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> In my opinion,  merging the repository doesn't break the separation
>>>>>>>>>>>>> of hyracks and asterixdb, because the dependencies are controlled by
>>>>>>>>>>>>> mvn pom files.
>>>>>>>>>>>>
>>>>>>>>>>>> That wasn't the separation I was talking about. I meant API
>>>>>>>>>>>> separation. As it is now, when we make a change to both Asterix and
>>>>>>>>>>>> Hyracks, we are forced to consider the API implications, or at least
>>>>>>>>>>>> they are put out there in a very clear way that we need to look at.
>>>>>>>>>>>> If we merge them, people will (rightly) treat the whole thing as one
>>>>>>>>>>>> product, and there will be no brakes on making wide-ranging API
>>>>>>>>>>>> changes.
>>>>>>>>>>>>
>>>>>>>>>>>> (As an aside: I don't trust Maven's pom files to do a good job of
>>>>>>>>>>>> keeping the dependency management clean. In fact I trust it to do
>>>>>>>>>>>> precisely the opposite, by making it both easier to screw up the
>>>>>>>>>>>> dependencies and harder to update them in future.)
>>>>>>>>>>>>
>>>>>>>>>>>> Again, my point is this: If we truly believe that Hyracks is a
>>>>>>>>>>>> re-usable component, it should be treated as such from source to
>>>>>>>>>>>> build to delivery. By merging in Asterix, we are saying that Asterix
>>>>>>>>>>>> is "more equal" than other Hyracks clients, to the point that we're
>>>>>>>>>>>> tacitly willing to break those other clients in favor of simplifying
>>>>>>>>>>>> Asterix development. If that is a fair and true statement, well,
>>>>>>>>>>>> then, sure, let's merge them.
>>>>>>>>>>>>
>>>>>>>>>>>>> 1) It forces those hyracks-only changes to pass asterixdb regression
>>>>>>>>>>>>> tests.  Currently hyracks-only changes are not verified by asterixdb
>>>>>>>>>>>>> tests.
>>>>>>>>>>>>
>>>>>>>>>>>> This is a good point, I will admit. However, I think this same goal
>>>>>>>>>>>> can be met in other ways. My strong preference would be to create a
>>>>>>>>>>>> set of true API tests inside of Hyracks, which both document and test
>>>>>>>>>>>> the external Hyracks API. That will make API-breaking changes in
>>>>>>>>>>>> future much easier to spot, and also make it clear when Asterix is
>>>>>>>>>>>> using internal APIs that it should not.
>>>>>>>>>>>>
>>>>>>>>>>>>> 2) On my local machine,  I don't need to always install hyracks and
>>>>>>>>>>>>> then verify asterixdb from time to time.  Especially, switching
>>>>>>>>>>>>> branches seems painful because the installed hyracks snapshot is
>>>>>>>>>>>>> overwritten from time to time.
>>>>>>>>>>>>
>>>>>>>>>>>> I haven't tried working on multiple Hyracks branches at the same
>>>>>>>>>>>> time, so I haven't experienced this. This seems like a working method
>>>>>>>>>>>> error, though. If you're working with two things that are "the same
>>>>>>>>>>>> version" (even if that's a snapshot version), you'll need to use
>>>>>>>>>>>> separate Maven repositories to install them. In fact, merging the two
>>>>>>>>>>>> git repositories would do nothing to fix this problem, would it? If
>>>>>>>>>>>> the proposal is to put the two source repositories in the same git
>>>>>>>>>>>> repo but otherwise leave them untouched, then nothing would change in
>>>>>>>>>>>> the build process. It's possible I'm missing something there, though.
>>>>>>>>>>>>
>>>>>>>>>>>>> 3) I only need to make one code review request and one jenkins job.
>>>>>>>>>>>>> Currently I need to manually change the topic of my asterixdb gerrit
>>>>>>>>>>>>> CL every time before I update my hyracks CL, and then manually
>>>>>>>>>>>>> schedule jenkins to run a new asterixdb job.  If I forget to schedule
>>>>>>>>>>>>> the jenkins job, the asterixdb CL is still shown to be "verified by
>>>>>>>>>>>>> jenkins".
>>>>>>>>>>>>
>>>>>>>>>>>> This is a problem, but it's a problem in commit validation, not in
>>>>>>>>>>>> the source. Modifying the source to work around these issues is still
>>>>>>>>>>>> a bad idea IMHO.
>>>>>>>>>>>>
>>>>>>>>>>>> The "change-topic" issue could be fixed with a bit of development
>>>>>>>>>>>> work (have the topic point to a change, rather than a specific
>>>>>>>>>>>> patchset on the change, so you only need to set it once, for
>>>>>>>>>>>> instance).
>>>>>>>>>>>>
>>>>>>>>>>>> As for manually scheduling Asterix Jenkins jobs, that sounds like
>>>>>>>>>>>> it's only a problem where your Hyracks change breaks an existing
>>>>>>>>>>>> public API. That would be obviated by having true API testing inside
>>>>>>>>>>>> of Hyracks, which is something that we should have regardless of any
>>>>>>>>>>>> decisions about source locations.
>>>>>>>>>>>>
>>>>>>>>>>>> In summary / repeating myself again: yes, we have some problems
>>>>>>>>>>>> because Hyracks and Asterix are in separate repositories. But those
>>>>>>>>>>>> problems are pointing out true issues with our development and
>>>>>>>>>>>> processes. Merging the repositories isn't fixing those problems, it's
>>>>>>>>>>>> sweeping them under the rug. Long term we would be much better off to
>>>>>>>>>>>> identify, isolate, and fix the problems themselves.
>>>>>>>>>>>>
>>>>>>>>>>>> Ceej
>>>>>>>>>>>> aka Chris Hillery
>>>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Ildar
>>>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Ildar

