asterixdb-dev mailing list archives

From Steven Jacobs <sjaco...@ucr.edu>
Subject Re: Migration of git repository
Date Wed, 03 Jun 2015 17:30:57 GMT
Here is my $.01 (offered at a discount):


*"That wasn't the separation I was talking about. I meant API separation.*
*As it is now, when we make a change to both Asterix and Hyracks, we are*
*forced to consider the API implications, or at least they are put out there*
*in a very clear way that we need to look at. If we merge them, people will*
*(rightly) treat the whole thing as one product, and there will be no brakes*
*on making wide-ranging API changes." - Chris*

-The way it is now doesn't prevent this at all. I think most of us have
both projects in one Eclipse workspace and switch seamlessly between them.
It's already up to the individual to pay attention to cross-code
implications.


*"This is a good point, I will admit. However, I think this same goal can*
*be met in other ways. My strong preference would be to create a set of true*
*API tests inside of Hyracks, which both document and test the external*
*Hyracks API. That will make API-breaking changes in future much easier to*
*spot, and also make it clear when Asterix is using internal APIs that it*
*should not...*

*As for manually scheduling Asterix Jenkins jobs, that sounds like it's*
*only a problem where your Hyracks change breaks an existing public API. That*
*would be obviated by having true API testing inside of Hyracks, which is*
*something that we should have regardless of any decisions about source*
*locations." -Chris*

-I agree with Yingyi's comments here that adding API tests to Hyracks won't
prevent Hyracks changes from causing other issues to appear in Asterix.
There are lots of demons in the machine that can come out that are
unrelated to APIs. This brings me to my main comment:

Right now Till, Preston, and I are all VXQuery committers who can also
commit to Hyracks. This isn't a big issue since we are all Asterix
committers as well. But how should this work in general? Suppose a newcomer
to VXQuery (or another Hyracks-dependent project) reaches committer status.
After some time on the project, she makes a change that requires changes to
Hyracks, so she is given committer status to Hyracks. What is our policy
for her changes (as they could break Asterix)?
In summary:
Are you required to test against Asterix when you commit to Hyracks?

I feel like the answer to this question will flush out whether or not we
can really consider Asterix and Hyracks as completely separate entities.

Steven

Side note: This wouldn't be so much of an issue if all changes to Hyracks
required a version change. But this is not the case as it is now.

On Wed, Jun 3, 2015 at 12:29 AM, Mike Carey <dtabass@gmail.com> wrote:

> So is there some way to fix change-topic and other user experience issues
> that separation "causes"?
> I.e, could we have our cake (separated code bases for multiple Hyracks
> consumers) and eat it too in
> AsterixDB (not feeling added pain, but having a fairly seamless experience
> if you do both-level stuff)?
>
>
> On 6/2/15 11:59 PM, Till Westmann wrote:
>
>> On Jun 2, 2015, at 22:45, Yingyi Bu <buyingyi@gmail.com> wrote:
>>>
>>>> I haven't tried working on multiple Hyracks branches at the same time,
>>>> so I haven't experienced this. This seems like a working method error,
>>>> though. If you're working with two things that are "the same version"
>>>> (even if that's a snapshot version), you'll need to use separate Maven
>>>> repositories to install them. In fact, merging the two git repositories
>>>> would do nothing to fix this problem, would it? If the proposal is to put
>>>> the two source repositories in the same git repo but otherwise leave them
>>>> untouched, then nothing would change in the build process. It's possible
>>>> I'm missing something there, though.
>>>>
>>> Is there a way to use multiple mvn repositories on the same machine?
>>> I used to think mvn always installs artifacts to the directory
>>> ~/.m2/repository.
>>> I guess we just need to have a root-level pom and leave hyracks and
>>> asterixdb untouched.  Then, a single root-level "mvn package ..." will
>>> build everything without requiring installing hyracks first.  It's just
>>> like what we currently do for hyracks and algebricks.  Then, builds/tests
>>> do not leave side-effects in ~/.m2/repository.
>>>
>> Great question! I just looked into this a bit (but I didn't try it) and
>> the docs seem to suggest that
>> a) you should be able to specify the local repository in a settings.xml
>> and that
>> b) you should be able to specify the settings.xml on the maven command
>> line.
>> So it should be possible to do that - and with some shell magic I think
>> that it should even be possible to do that in a largely invisible way.
>>
>>>> As for manually scheduling Asterix Jenkins jobs, that sounds like it's
>>>> only a problem where your Hyracks change breaks an existing public API.
>>>> That would be obviated by having true API testing inside of Hyracks,
>>>> which is something that we should have regardless of any decisions about
>>>> source locations.
>>>>
>>> I agree that's the right software engineering way. Going forward, we do
>>> need to add more unit tests in hyracks and asterixdb. But considering the
>>> resource constraints, I'm not sure whether (or when) we can have a complete
>>> API test suite for hyracks/algebricks:
>>> 1)  both hyracks and algebricks public APIs allow an arbitrary input DAG
>>> (a logical plan or a hyracks job).  It's hard to enumerate all
>>> possibilities in hyracks/algebricks tests.  My experience is that when we
>>> see a broken AQL query,  we fix it in both hyracks/asterixdb codebases,
>>> and verify it with the AQL query. In those cases,  there might be no need
>>> to have yet-another verbose hyracks/algebricks test.
>>> 2)  even if we have a comprehensive test suite for hyracks,  I'm not
>>> sure whether it can guarantee to pass asterixdb tests because the current
>>> asterixdb test suite covers a lot of edge cases in the hyracks runtime,
>>> LSM, and algebricks.
>>>
>> One way to use existing clients as tests for Hyracks could be to set up a
>> system that runs the tests of the existing versions of the clients against
>> a new version of Hyracks - ideally all clients isolated from each other and
>> in parallel to keep turnaround times low.
>> Does that sound feasible?
>>
>> Cheers,
>> Till
>>
>>> Anyway, if the repositories have to be separated, it would be nice if
>>> the "change-topic" issue could be fixed.
>>>
>>> Best,
>>> Yingyi
>>>
>>>
>>> On Tue, Jun 2, 2015 at 10:00 AM, Chris Hillery <chillery@lambda.nu> wrote:
>>>>
>>>>> On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
>>>>> In my opinion, merging the repository doesn't break the separation of
>>>>> hyracks and asterixdb, because the dependencies are controlled by mvn
>>>>> pom files.
>>>>>
>>>> That wasn't the separation I was talking about. I meant API separation.
>>>> As it is now, when we make a change to both Asterix and Hyracks, we are
>>>> forced to consider the API implications, or at least they are put out there
>>>> in a very clear way that we need to look at. If we merge them, people will
>>>> (rightly) treat the whole thing as one product, and there will be no brakes
>>>> on making wide-ranging API changes.
>>>>
>>>> (As an aside: I don't trust Maven's pom files to do a good job of
>>>> keeping the dependency management clean. In fact I trust it to do precisely
>>>> the opposite, by making it both easier to screw up the dependencies and
>>>> harder to update them in future.)
>>>>
>>>> Again, my point is this: If we truly believe that Hyracks is a
>>>> re-usable component, it should be treated as such from source to build to
>>>> delivery. By merging in Asterix, we are saying that Asterix is "more equal"
>>>> than other Hyracks clients, to the point that we're tacitly willing to
>>>> break those other clients in favor of simplifying Asterix development. If
>>>> that is a fair and true statement, well, then, sure, let's merge them.
>>>>
>>>>> 1) It forces those hyracks-only changes to pass asterixdb regression
>>>>> tests.  Currently hyracks-only changes are not verified by asterixdb tests.
>>>>>
>>>> This is a good point, I will admit. However, I think this same goal can
>>>> be met in other ways. My strong preference would be to create a set of true
>>>> API tests inside of Hyracks, which both document and test the external
>>>> Hyracks API. That will make API-breaking changes in future much easier to
>>>> spot, and also make it clear when Asterix is using internal APIs that it
>>>> should not.
>>>>
>>>>
>>>>> 2) On my local machine,  I don't need to always install hyracks and
>>>>> then verify asterixdb from time to time.  Especially, switching branches
>>>>> seems painful because the installed hyracks snapshot is overwritten from
>>>>> time to time.
>>>>>
>>>> I haven't tried working on multiple Hyracks branches at the same time,
>>>> so I haven't experienced this. This seems like a working method error,
>>>> though. If you're working with two things that are "the same version" (even
>>>> if that's a snapshot version), you'll need to use separate Maven
>>>> repositories to install them. In fact, merging the two git repositories
>>>> would do nothing to fix this problem, would it? If the proposal is to put
>>>> the two source repositories in the same git repo but otherwise leave them
>>>> untouched, then nothing would change in the build process. It's possible
>>>> I'm missing something there, though.
>>>>
>>>>
>>>>> 3) I only need to make one code review request and one jenkins job.
>>>>> Currently I need to manually change the topic of my asterixdb gerrit CL
>>>>> every time before I update my hyracks CL, and then manually schedule
>>>>> jenkins to run a new asterixdb job.  If I forget to schedule the jenkins
>>>>> job, the asterixdb CL is still shown to be "verified by jenkins".
>>>>>
>>>> This is a problem, but it's a problem in commit validation, not in the
>>>> source. Modifying the source to work around these issues is still a bad
>>>> idea IMHO.
>>>>
>>>> The "change-topic" issue could be fixed with a bit of development work
>>>> (have the topic point to a change, rather than a specific patchset on the
>>>> change, so you only need to set it once, for instance).
>>>>
>>>> As for manually scheduling Asterix Jenkins jobs, that sounds like it's
>>>> only a problem where your Hyracks change breaks an existing public API.
>>>> That would be obviated by having true API testing inside of Hyracks, which
>>>> is something that we should have regardless of any decisions about source
>>>> locations.
>>>>
>>>> In summary / repeating myself again: yes, we have some problems because
>>>> Hyracks and Asterix are in separate repositories. But those problems are
>>>> pointing out true issues with our development and processes. Merging the
>>>> repositories isn't fixing those problems, it's sweeping them under the rug.
>>>> Long term we would be much better off to identify, isolate, and fix the
>>>> problems themselves.
>>>>
>>>> Ceej
>>>> aka Chris Hillery
>>>>
>>>>
>>>>
>
