asterixdb-dev mailing list archives

From Ian Maxon <ima...@uci.edu>
Subject Re: Migration of git repository
Date Tue, 02 Jun 2015 16:05:52 GMT
I'm in favor of merging them as well. Keeping the git repositories separate
doesn't enforce any kind of architectural separation, it just makes build +
test more complex. Nearly every major change is using the topic field hack
by this point.
I think the only downside is that the tests will take longer, but that may
need to be revisited anyway (in Hyracks, the index stress tests, especially
for inverted indexes, take far too long).

Another .02¢ :)

- Ian

On Mon, Jun 1, 2015 at 9:46 PM, Yingyi Bu <buyingyi@gmail.com> wrote:

> Chris,
>
> Thanks for the input!!
>
> >>1. If we're serious about Hyracks being a re-usable component of other
> >>products, it makes sense to dogfood that in Asterixdb. If there are
> >>problems keeping Hyracks separate from Asterix or keeping Hyracks with
> >>clean interfaces, this forces us to address them.
>
> In my opinion, merging the repositories doesn't break the separation of
> hyracks and asterixdb, because the dependencies are controlled by mvn pom
> files. We just make the code physically live together under the root
> directory: one part is hyracks as it is and the other is asterixdb as it is.
> For example, Spark lives together with all the things on top of it and that
> doesn't seem to prevent its reusability. Hadoop lived together with
> Hive/Pig/Zookeeper in the same repo until 2010, when it was very stable.
>
> Currently almost all my changes span hyracks and asterixdb.  I
> believe many people also suffer from that.  Merging them together will have
> the following benefits:
> 1) It forces hyracks-only changes to pass asterixdb regression
> tests.  Currently hyracks-only changes are not verified by asterixdb tests.
> 2) On my local machine, I no longer need to install hyracks every time
> before verifying asterixdb.  Switching branches is especially painful
> because the installed hyracks snapshot is overwritten from time to
> time.
> 3) I only need to make one code review request and one jenkins job.
> Currently I need to manually change the topic of my asterixdb gerrit CL
> every time before I update my hyracks CL, and then manually schedule
> jenkins to run a new asterixdb job.  If I forget to schedule the jenkins
> job, the asterixdb CL is still shown to be "verified by jenkins".
>
> >>2. We only just recently took the initiative to take Pregelix and
> >>Hivesterix *out* of the same repository, and that was because they were
> >>specifically causing us problems as components of the same build. (There
> >>were issues of competing dependency versions with Ian's YARN work, as well
> >>as several spurious pregelix test failures, as I recall.) At a bare
> >>minimum, we cannot merge those projects back in without re-researching and
> >>addressing those problems.
>
> Those will definitely be fixed before Pregelix and IMRU are merged
> back.  Hivesterix is dead and will not be merged. I'm not proposing that we
> bring Pregelix and IMRU in now, but that we do so later when they are
> ready.
>
> Best,
> Yingyi
>
>
>
>
> On Mon, Jun 1, 2015 at 5:15 PM, Chris Hillery <chillery@lambda.nu> wrote:
>
> > My $.02 - no, we shouldn't.
> >
> > Two main reasons:
> >
> > 1. If we're serious about Hyracks being a re-usable component of other
> > products, it makes sense to dogfood that in Asterixdb. If there are
> > problems keeping Hyracks separate from Asterix or keeping Hyracks with
> > clean interfaces, this forces us to address them.
> >
> > 2. We only just recently took the initiative to take Pregelix and
> > Hivesterix *out* of the same repository, and that was because they were
> > specifically causing us problems as components of the same build. (There
> > were issues of competing dependency versions with Ian's YARN work, as well
> > as several spurious pregelix test failures, as I recall.) At a bare
> > minimum, we cannot merge those projects back in without re-researching and
> > addressing those problems.
> >
> > What benefits would we gain by merging them? I honestly don't agree with
> > Yingyi's suggestion that it would make building, bug-fixing, and code
> > review much simpler. At best it would help a bit on those occasions when a
> > change spans Hyracks and Asterix, and again, IMHO that is something that
> > *should* require additional thought and oversight. As for build and test,
> > my feeling is that it will make it considerably harder, or at the very
> > least slower, simply due to doubling the Maven overhead.
> >
> > I do not feel that merging the projects to either fit in better with
> > Apache, or to game the Apache popularity indexes, is a good trade-off.
> >
> > Ceej
> > aka Chris Hillery
> >
> > On Mon, Jun 1, 2015 at 12:02 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
> >
> >> Hi folks,
> >>
> >>     Should we merge hyracks, asterixdb, and potentially pregelix/imru
> >> into the same repository?   It will make the build, fix, and code review
> >> process much simpler.
> >>     An example is that everything built on top of Spark lives in the same
> >> repository:  https://github.com/apache/spark.   That's also why Spark is
> >> the most active Apache project now, due to its commit frequency.
> >>     Does anyone have concerns about merging the hyracks and asterixdb
> >> repositories?
> >>     Thanks!
> >>
> >> Best,
> >> Yingyi
> >>
> >>
> >> On Wed, Apr 22, 2015 at 10:13 PM, Till Westmann <tillw@apache.org> wrote:
> >>
> >>> Ok, let’s find out what is the “more work” part before we decide :)
> >>>
> >>> We should already have the SGA (as it’s part of the SGA that Mike sent
> >>> in) and it seemed to me that all we’d need to do “later” (e.g. next
> >>> week/month) would be to
> >>> a) vote on bringing it into AsterixDB (that would be an incubator vote,
> >>> I assume) and
> >>> b) ask infra for another git repository.
> >>> So the extra work would be the vote on the incubator list.
> >>> Is that right or is there something else we’d need to do?
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Apr 22, 2015, at 10:04 PM, Mattmann, Chris A (3980) <
> >>> chris.a.mattmann@jpl.nasa.gov> wrote:
> >>>
> >>> Hey Mike and team,
> >>>
> >>> Thanks for bringing this to the list. I think these are precisely
> >>> the type of conversations that we want to have here at the ASF and
> >>> as part of our Incubating project. Having these discussions in the
> >>> community here at the ASF (which is now the Apache AsterixDB community)
> >>> is great.
> >>>
> >>> My opinion - it’s fine either way. I’m happy if you guys want to
> >>> bring Pregelix into the code base here via AsterixDB. It’s easily
> >>> reversible and incremental. If you want to spin out Pregelix later
> >>> as its own TLP and it’s shown to have its own community we can
> >>> file a board resolution to do that. Heck, nothing stops us from
> >>> graduating 2 Incubator projects=>TLPs out of this effort even in
> >>> the Incubator. That’s fine. If you want to wait and bring it in
> >>> later, it will definitely be more work - so let’s call a spade a
> >>> spade there. But if you want to do that that’s fine too.
> >>>
> >>> My personal recommendation - bring it in - won’t hurt and we can
> >>> always pivot in the ways above later.
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>>
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Chris Mattmann, Ph.D.
> >>> Chief Architect
> >>> Instrument Software and Science Data Systems Section (398)
> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>> Office: 168-519, Mailstop: 168-527
> >>> Email: chris.a.mattmann@nasa.gov
> >>> WWW:  http://sunset.usc.edu/~mattmann/
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Adjunct Associate Professor, Computer Science Department
> >>> University of Southern California, Los Angeles, CA 90089 USA
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: Michael Carey <mjcarey@ics.uci.edu>
> >>> Date: Tuesday, April 21, 2015 at 11:49 AM
> >>> To: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>, Till Westmann
> >>> <till@westmann.org>
> >>> Cc: Chris Hillery <chillery@lambda.nu>, Ian Maxon <imaxon@uci.edu>,
> >>> Yingyi Bu <buyingyi@gmail.com>, "dev@asterixdb.incubator.apache.org"
> >>> <dev@asterixdb.incubator.apache.org>
> >>> Subject: Re: Migration of git repository
> >>>
> >>> Sure!  Let me clarify the issue for everyone (and broaden the question).
> >>>
> >>> One of the technical by-products of the AsterixDB project is a graph
> >>> analytics package called Pregelix - as the name suggests, it is a
> >>> "knock off" of Pregel, as are packages like Giraph.  What's unique about
> >>> Pregelix is that it actually scales without OOM'ing - under the covers
> >>> it uses database join processing techniques.  You can find out more
> >>> about it by visiting http://pregelix.ics.uci.edu/ and/or by skimming the
> >>> attached paper - check out the experimental results compared to other
> >>> popular alternatives.  Anyway, we have made it freely available (as we
> >>> do all of our AsterixDB-related research products) and we were thinking
> >>> that we should simply include it under the AsterixDB project - kind of
> >>> like Spark has subprojects for SQL, streams, graphs, etc.  As a result,
> >>> I listed it on the list of transferred artifacts when I sent in the
> >>> licensing form the other day.  (So we at least have that step done.)
> >>> Its code contributors have been a small subset of the AsterixDB team; it
> >>> was a small sub-project, basically.  (Mostly just Yingyi Bu!)
> >>>
> >>> Pregelix is kind of a sibling of Apache VXQuery in that its runtime is
> >>> based on Hyracks but it hasn't otherwise been AsterixDB-dependent.
> >>> However, we have just finished teaching it to read/write directly from
> >>> AsterixDB native storage - instead of just HDFS
> >>> - so now it has an AsterixDB dependency, and we are using it as a
> >>> driving example of how to couple AsterixDB to other analytic engines.
> >>>
> >>> Rather than going through another exercise to open-source this
> >>> separately, it seemed like we could take this approach.
> >>>
> >>> Thoughts?
> >>> Cheers,
> >>> Mike
> >>>
> >>>
> >>> On 4/21/15 7:45 AM, Mattmann, Chris A (3980) wrote:
> >>>
> >>>
> >>> Yes, in fact, this whole conversation should be happening on
> >>> the dev list. OK for me to CC them on my reply?
> >>>
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Chris Mattmann, Ph.D.
> >>> Chief Architect
> >>> Instrument Software and Science Data Systems Section (398)
> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>> Office: 168-519, Mailstop: 168-527
> >>> Email: chris.a.mattmann@nasa.gov
> >>> WWW:  http://sunset.usc.edu/~mattmann/
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> Adjunct Associate Professor, Computer Science Department
> >>> University of Southern California, Los Angeles, CA 90089 USA
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: "Michael J. Carey" <mjcarey@ics.uci.edu>
> >>> Date: Tuesday, April 21, 2015 at 3:13 AM
> >>> To: Till Westmann <till@westmann.org>
> >>> Cc: Chris Hillery <chillery@lambda.nu>, Ian Maxon <imaxon@uci.edu>,
> >>> Yingyi Bu <buyingyi@gmail.com>, Chris Mattmann
> >>> <Chris.A.Mattmann@jpl.nasa.gov>
> >>> Subject: Re: Migration of git repository
> >>>
> >>> + Yingyi on the Pregelix Q.  Should we also ask Chris M for advice on
> >>> that?
> >>> On Apr 20, 2015 4:23 PM, "Till Westmann" <till@westmann.org> wrote:
> >>>
> >>> Hi Ian,
> >>>
> >>>
> >>> That’s a good question - and I don’t know the answer.
> >>> We’ve got 2 repos so far:
> >>>
> >>>
> >>> https://issues.apache.org/jira/browse/INFRA-9212
> >>> https://issues.apache.org/jira/browse/INFRA-9306
> >>> so we should have space for Hyracks and AsterixDB.
> >>>
> >>>
> >>> I think that there’s an open question about Pregelix, but maybe that
> >>> shouldn’t keep us from going ahead.
> >>>
> >>>
> >>> I further think that it would be great if you could send an e-mail to
> >>> dev@asterixdb.incubator.apache.org and ask if it’s ok to import
> >>> our git repo(s) or if something else needs to be done first. (I could
> >>> send that e-mail as well, but it would be great if there were more
> >>> non-Till e-mails on the list :) )
> >>>
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>>
> >>> On Apr 20, 2015, at 4:07 PM, Ian Maxon <imaxon@uci.edu> wrote:
> >>>
> >>> Hi Mike, Chris and Till,
> >>>
> >>>
> >>> Since (I think?) the paperwork for the software grant is done now, should
> >>> I copy our GC branches over to the ASF git repositories now (as well as
> >>> making it a mirror in the Gerrit commit hook script)?
> >>>
> >>>
> >>> Thanks,
> >>> - Ian
> >>>
> >>>
> >>>
> >>
> >
>
