asterixdb-dev mailing list archives

From Yingyi Bu <buyin...@gmail.com>
Subject Re: Migration of git repository
Date Tue, 02 Jun 2015 04:46:52 GMT
Chris,

Thanks for the input!!

>>1. If we're serious about Hyracks being a re-usable component of other
products, it makes sense to dogfood that in Asterixdb. If there are
problems keeping Hyracks separate from Asterix or keeping Hyracks with
clean interfaces, this forces us to address them.

In my opinion, merging the repositories doesn't break the separation of
hyracks and asterixdb, because the dependencies are controlled by mvn pom
files. We just make the code physically live together under the root
directory: one is hyracks as it is and the other is asterixdb as it is.
For example, Spark lives together with all the things on top of it, and that
doesn't seem to prevent its reusability. Hadoop lived together with
Hive/Pig/Zookeeper in the same repo until 2010, when it was very stable.

Currently almost all my changes span hyracks and asterixdb. I
believe many people also suffer from that. Merging them together will have
the following benefits:
1) It forces hyracks-only changes to pass the asterixdb regression
tests. Currently hyracks-only changes are not verified by asterixdb tests.
2) On my local machine, I won't need to repeatedly install hyracks and then
verify asterixdb. Switching branches is especially painful because the
installed hyracks snapshot keeps getting overwritten.
3) I only need to make one code review request and one jenkins job.
Currently I need to manually change the topic of my asterixdb gerrit CL
every time before I update my hyracks CL, and then manually schedule
jenkins to run a new asterixdb job. If I forget to schedule the jenkins
job, the asterixdb CL is still shown as "verified by jenkins".

>>2. We only just recently took the initiative to take Pregelix and
Hivesterix *out* of the same repository, and that was because they were
specifically causing us problems as components of the same build. (There
were issues of competing dependency versions with Ian's YARN work, as well
as several spurious pregelix test failures, as I recall.) At a bare
minimum, we cannot merge those projects back in without re-researching and
addressing those problems.

Those will definitely be fixed before Pregelix and IMRU are merged
back. Hivesterix is dead and will not be merged. I'm not proposing that we
bring Pregelix and IMRU in now, but rather later, when they are
ready.

Best,
Yingyi




On Mon, Jun 1, 2015 at 5:15 PM, Chris Hillery <chillery@lambda.nu> wrote:

> My $.02 - no, we shouldn't.
>
> Two main reasons:
>
> 1. If we're serious about Hyracks being a re-usable component of other
> products, it makes sense to dogfood that in Asterixdb. If there are
> problems keeping Hyracks separate from Asterix or keeping Hyracks with
> clean interfaces, this forces us to address them.
>
> 2. We only just recently took the initiative to take Pregelix and
> Hivesterix *out* of the same repository, and that was because they were
> specifically causing us problems as components of the same build. (There
> were issues of competing dependency versions with Ian's YARN work, as well
> as several spurious pregelix test failures, as I recall.) At a bare
> minimum, we cannot merge those projects back in without re-researching and
> addressing those problems.
>
> What benefits would we gain by merging them? I honestly don't agree with
> Yingyi's suggestion that it would make building, bug-fixing, and code
> review much simpler. At best it would help a bit on those occasions when a
> change spans Hyracks and Asterix, and again, IMHO that is something that
> *should* require additional thought and oversight. As for build and test,
> my feeling is that it will make it considerably harder, or at the very
> least slower, simply due to doubling the Maven overhead.
>
> I do not feel that merging the projects to either fit in better with
> Apache, or to game the Apache popularity indexes, is a good trade-off.
>
> Ceej
> aka Chris Hillery
>
> On Mon, Jun 1, 2015 at 12:02 PM, Yingyi Bu <buyingyi@gmail.com> wrote:
>
>> Hi folks,
>>
>>     Should we merge hyracks, asterixdb, and potentially pregelix/imru
>> into the same repository?   It would make the build, fix, and code review
>> process much simpler.
>>     An example is that everything built on top of Spark lives in the same
>> repository:  https://github.com/apache/spark.   That's also why Spark is
>> the most active Apache project now, due to its commit frequency.
>>     Does anyone have concerns about merging the hyracks and asterixdb
>> repositories?
>>     Thanks!
>>
>> Best,
>> Yingyi
>>
>>
>> On Wed, Apr 22, 2015 at 10:13 PM, Till Westmann <tillw@apache.org> wrote:
>>
>>> Ok, let’s find out what is the “more work” part before we decide :)
>>>
>>> We should already have the SGA (as it’s part of the SGA that Mike sent
>>> in) and it seemed to me that all we’d need to do “later” (e.g. next
>>> week/month) would be to
>>> a) vote on bringing it into AsterixDB (that would be an incubator vote I
>>> assume) and
>>> b) asking infra for another git repository.
>>> So the extra work would be the vote on the incubator list.
>>> Is that right or is there something else we’d need to do?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Apr 22, 2015, at 10:04 PM, Mattmann, Chris A (3980) <
>>> chris.a.mattmann@jpl.nasa.gov> wrote:
>>>
>>> Hey Mike and team,
>>>
>>> Thanks for bringing this to the list. I think these are precisely
>>> the type of conversations that we want to have here at the ASF and
>>> as part of our Incubating project. Having these discussions in the
>>> community here at the ASF (which is now the Apache AsterixDB community)
>>> is great.
>>>
>>> My opinion - it’s fine either way. I’m happy if you guys want to
>>> bring Pregelix into the code base here via AsterixDB. It’s easily
>>> reversible and incremental. If you want to spin out Pregelix later
>>> as its own TLP and it’s shown to have its own community we can
>>> file a board resolution to do that. Heck, nothing stops us from
>>> graduating 2 Incubator projects=>TLPs out of this effort even in
>>> the Incubator. That’s fine. If you want to wait and bring it in
>>> later, it will definitely be more work - so let’s call a spade a
>>> spade there. But if you want to do that that’s fine too.
>>>
>>> My personal recommendation - bring it in - won’t hurt and we can
>>> always pivot in the ways above later.
>>>
>>> Cheers,
>>> Chris
>>>
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Chief Architect
>>> Instrument Software and Science Data Systems Section (398)
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 168-519, Mailstop: 168-527
>>> Email: chris.a.mattmann@nasa.gov
>>> WWW:  http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Associate Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Michael Carey <mjcarey@ics.uci.edu>
>>> Date: Tuesday, April 21, 2015 at 11:49 AM
>>> To: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>, Till Westmann
>>> <till@westmann.org>
>>> Cc: Chris Hillery <chillery@lambda.nu>, Ian Maxon <imaxon@uci.edu>,
>>> Yingyi Bu <buyingyi@gmail.com>, "dev@asterixdb.incubator.apache.org"
>>> <dev@asterixdb.incubator.apache.org>
>>> Subject: Re: Migration of git repository
>>>
>>> Sure!  Let me clarify the issue for everyone (and broaden the question).
>>>
>>> One of the technical by-products of the AsterixDB project is a graph
>>> analytics package called Pregelix - as the name suggests, it is a "knock
>>> off" of Pregel, as are packages like Giraph.  What's unique about
>>> Pregelix is that it actually scales without OOM'ing
>>> - under the covers it uses database join processing techniques.  You can
>>> find out more about it by visiting
>>> http://pregelix.ics.uci.edu/ and/or by skimming the attached paper -
>>> check out the experimental results compared to other popular
>>> alternatives.  Anyway, we have made it freely available (as we do all of
>>> our AsterixDB-related
>>> research products) and we were thinking that we should simply include it
>>> under the AsterixDB project - kind of like Spark has subprojects for SQL,
>>> streams, graphs, etc.  As a result, I listed it on the list of
>>> transferred artifacts when I sent in the licensing
>>> form the other day.  (So we at least have that step done.)  Its code
>>> contributors have been a small subset of the AsterixDB team; it was a
>>> small sub-project, basically.  (Mostly just Yingyi Bu!)
>>>
>>> Pregelix is kind of a sibling of Apache VXQuery in that its runtime is
>>> based on Hyracks but it hasn't otherwise been AsterixDB-dependent.
>>> However, we have just finished teaching it to read/write directly from
>>> AsterixDB native storage - instead of just HDFS
>>> - so now it has an AsterixDB dependency, and we are using it as a
>>> driving example of how to couple AsterixDB to other analytic engines.
>>>
>>> Rather than going through another exercise to open-source this
>>> separately, it seemed like we could take this approach.
>>>
>>> Thoughts?
>>> Cheers,
>>> Mike
>>>
>>>
>>> On 4/21/15 7:45 AM, Mattmann, Chris A (3980) wrote:
>>>
>>>
>>> Yes, in fact, this whole conversation should be happening on
>>> the dev list. OK for me to CC them on my reply?
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: "Michael J. Carey" <mjcarey@ics.uci.edu>
>>> Date: Tuesday, April 21, 2015 at 3:13 AM
>>> To: Till Westmann <till@westmann.org>
>>> Cc: Chris Hillery <chillery@lambda.nu>, Ian Maxon <imaxon@uci.edu>,
>>> Yingyi Bu <buyingyi@gmail.com>, Chris Mattmann
>>> <Chris.A.Mattmann@jpl.nasa.gov>
>>> Subject: Re: Migration of git repository
>>>
>>> + Yingyi on the Pregelix Q.  Should we also ask Chris M for advice on
>>> that?
>>> On Apr 20, 2015 4:23 PM, "Till Westmann" <till@westmann.org> wrote:
>>>
>>> Hi Ian,
>>>
>>>
>>> That’s a good question - and I don’t know the answer.
>>> We’ve got 2 repos so far:
>>>
>>> https://issues.apache.org/jira/browse/INFRA-9212
>>> https://issues.apache.org/jira/browse/INFRA-9306
>>> so we should have space for Hyracks and AsterixDB.
>>>
>>>
>>> I think that there’s an open question about Pregelix, but maybe that
>>> shouldn’t keep us from going ahead.
>>>
>>>
>>> I further think that it would be great if you could send an e-mail to
>>> dev@asterixdb.incubator.apache.org and ask if it’s ok to import
>>> our git repo(s) or if something else needs to be done first. (I could
>>> send that e-mail as well, but it would be great if there were more
>>> non-Till e-mails on the list :) )
>>>
>>>
>>> Cheers,
>>> Till
>>>
>>>
>>> On Apr 20, 2015, at 4:07 PM, Ian Maxon <imaxon@uci.edu> wrote:
>>>
>>> Hi Mike, Chris and Till,
>>>
>>>
>>> Since (I think?) the paperwork for the software grant is done now, should
>>> I copy our GC branches over to the ASF git repositories now (as well as
>>> making it a mirror in the Gerrit commit hook script)?
>>>
>>>
>>> Thanks,
>>> - Ian
>>>
>>>
>>>
>>
>
