incubator-general mailing list archives

From Steve Lawrence <stephen.d.lawre...@gmail.com>
Subject Re: [DISCUSS] Daffodil Incubation Proposal
Date Wed, 09 Aug 2017 15:55:31 GMT
Sounds good to me. Can I start a vote, or is that something a champion/mentor
would normally start? The project also does not have a champion--is that
necessary/would either of you be interested in being the champion?

Thanks,
- Steve

On 08/08/2017 10:59 PM, Dave Fisher wrote:
> Hi -
> 
> I agree. I'm willing to proceed with John and me as Mentors.
> 
> Regards,
> Dave
> 
> Sent from my iPhone
> 
>> On Aug 8, 2017, at 7:10 PM, John D. Ament <johndament@apache.org> wrote:
>>
>> Steve,
>>
>> At this point, I'd recommend we wrap the discussion and call for a vote.  While ideally
>> we want 3 mentors, we can get started with 2 and see how things progress.
>>
>> John
>>
>>> On Wed, Aug 2, 2017 at 3:55 PM Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
>>> Thanks John!
>>>
>>> On 08/02/2017 03:23 PM, John D. Ament wrote:
>>>> You can also count me in as a mentor.
>>>>
>>>> John
>>>>
>>>> On Wed, Aug 2, 2017 at 3:14 PM Steve Lawrence <stephen.d.lawrence@gmail.com>
>>>> wrote:
>>>>
>>>>> Understood. Thanks for the interest!
>>>>>
>>>>> - Steve
>>>>>
>>>>> On 08/02/2017 02:57 PM, Dave Fisher wrote:
>>>>>> Hi Steve,
>>>>>>
>>>>>> It was not so much the lack of committers as it was the current
>>>>>> diversity. That is not a blocker for entry to Incubation.
>>>>>>
>>>>>> I am willing to be one of the Mentors. Once there are at least two
>>>>>> more we can push forward.
>>>>>>
>>>>>> Regards,
>>>>>> Dave
>>>>>>
>>>>>>> On Aug 1, 2017, at 5:09 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
>>>>>>>
>>>>>>> Discussions have died down, and I think the consensus from the responses
>>>>>>> is that the issues are 1) the lack of committers and 2) the lack of a
>>>>>>> champion and mentors. We hope to address #1 and grow the community as
>>>>>>> part of incubation. Is anyone interested in being a champion or mentor
>>>>>>> and helping us with #2?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> - Steve
>>>>>>>
>>>>>>> On 07/26/2017 04:06 PM, Chris Mattmann wrote:
>>>>>>>> This sounds like a very interesting project.
>>>>>>>>
>>>>>>>> I don’t have the time to mentor at the moment but I will keep a close
>>>>>>>> eye on it.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris Mattmann
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/25/17, 11:53 AM, "McHenry, Kenton Guadron" <mchenry@illinois.edu> wrote:
>>>>>>>>
>>>>>>>>    Hi Dave,
>>>>>>>>
>>>>>>>>    The developers that were at NCSA have moved on to other
>>>>>>>>    organizations.  While we still leverage Daffodil and are very much
>>>>>>>>    interested in seeing it move forward, development is currently done by
>>>>>>>>    the Tresys team.  Agreed on the synergy with Tika.
>>>>>>>>
>>>>>>>>    Kenton McHenry, Ph.D.
>>>>>>>>    Principal Research Scientist, Adjunct Assistant Professor of Computer Science
>>>>>>>>    Deputy Director of the Scientific Software & Applications Division
>>>>>>>>    National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
>>>>>>>>
>>>>>>>>    On Jul 24, 2017, at 1:55 PM, Dave Fisher <dave2wave@comcast.net> wrote:
>>>>>>>>
>>>>>>>>    Hi Kenton,
>>>>>>>>
>>>>>>>>    Is there any reason that you and others from the NCSA are not
>>>>>>>>    Initial Committers? That would make this proposal stronger.
>>>>>>>>
>>>>>>>>    Regarding Apache Tika - it relies on other projects including
>>>>>>>>    Apache POI and Apache PDFBox. They are pragmatic about what is used.
>>>>>>>>    If Daffodil works to expand then I think that there would be good
>>>>>>>>    synergy between the projects. I know as a POI PMC member that the POI
>>>>>>>>    community has significantly benefited from the Tika community, some of
>>>>>>>>    whom are from MITRE.
>>>>>>>>
>>>>>>>>    To date Tika has not emphasized structured data, although they do
>>>>>>>>    extract content from Excel and OpenOffice.
>>>>>>>>
>>>>>>>>    I am intrigued.
>>>>>>>>
>>>>>>>>    Regards,
>>>>>>>>    Dave
>>>>>>>>
>>>>>>>>    On Jul 24, 2017, at 10:55 AM, McHenry, Kenton Guadron <mchenry@illinois.edu> wrote:
>>>>>>>>
>>>>>>>>    Yes, DFDL and its open source implementation Daffodil are more
>>>>>>>>    about file formats and getting access to the entirety of a file's contents
>>>>>>>>    in a consistent way through machine readable specifications.  The work has
>>>>>>>>    implications in the area of digital preservation, allowing one to preserve
>>>>>>>>    these machine readable specifications rather than all the tools needed to
>>>>>>>>    open/save a file in order to work with it.  Imagine someone developing
>>>>>>>>    graphics software to work with 3D models and not having to worry about the
>>>>>>>>    hundreds of formats out there for 3D meshes (whether there are tools for
>>>>>>>>    opening the files and whether they can get access to those tools, whether
>>>>>>>>    the spec is available and worrying about how complex that spec is to
>>>>>>>>    implement, etc.), and simply building their code around the contents (e.g.
>>>>>>>>    vertices, faces, etc.).  One could come up with similar scenarios for other
>>>>>>>>    data types (documents, images, videos, audio, depth data, numeric data).
>>>>>>>>    Ideally, tools built to support DFDL could someday support any format for
>>>>>>>>    that type without the developer having to worry about the details of how
>>>>>>>>    that data is represented within a file.
>>>>>>>>
>>>>>>>>    Kenton McHenry, Ph.D.
>>>>>>>>    Principal Research Scientist, Adjunct Assistant Professor of Computer Science
>>>>>>>>    Deputy Director of the Scientific Software & Applications Division
>>>>>>>>    National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
>>>>>>>>
>>>>>>>>    On Jul 24, 2017, at 10:30 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
>>>>>>>>
>>>>>>>>    I'll preface this saying that I don't have a ton of experience with
>>>>>>>>    Apache Tika. But based on my understanding, Tika and Daffodil do have
>>>>>>>>    somewhat similar goals, but reach them in different ways. For example,
>>>>>>>>    Tika requires that one writes /code/ to perform data extraction, usually
>>>>>>>>    relying on existing Java libraries to extract the desired metadata. The
>>>>>>>>    downside to this is that code can be buggy, and libraries might not even
>>>>>>>>    exist for formats of interest (especially common with legacy and
>>>>>>>>    military data).
>>>>>>>>
>>>>>>>>    Daffodil, on the other hand, does not require one to write any code.
>>>>>>>>    Instead, one writes a DFDL Schema (similar to XML Schema, with DFDL
>>>>>>>>    annotations) that fully describes the data, which Daffodil then uses to
>>>>>>>>    convert the data to XML/JSON for extraction. So adding support for a new
>>>>>>>>    format means writing a new schema rather than new code. And less code
>>>>>>>>    generally means fewer bugs. Also, for secure systems that require
>>>>>>>>    certification, generally speaking, it is easier to certify a schema as
>>>>>>>>    compared to code.
>>>>>>>>
>>>>>>>>    We certainly don't believe that Daffodil could replace Tika, but it does
>>>>>>>>    have the potential to add new functionality to Tika for formats that do
>>>>>>>>    not have existing libraries. One of our goals is to look into
>>>>>>>>    integrating Daffodil support into tools like Tika. We'd love to hear
>>>>>>>>    from Tika devs if this is something they'd be interested in.
>>>>>>>>
>>>>>>>>    I'll also add that whereas Tika tends to focus primarily on metadata,
>>>>>>>>    DFDL schemas usually describe an entire file format down to the byte, so
>>>>>>>>    one can extract more than just metadata, including text and binary
>>>>>>>>    data. Further differentiating, Daffodil has support for serializing data
>>>>>>>>    (called unparse) from the XML/JSON representation, allowing one to
>>>>>>>>    transform or filter data as well. We don't believe this feature is all
>>>>>>>>    that applicable to Tika, but it may be useful to other technologies such
>>>>>>>>    as filtering or data fuzzing technologies.
>>>>>>>>
>>>>>>>>    - Steve
>>>>>>>>
>>>>>>>>
>>>>>>>>    On 07/24/2017 10:59 AM, Mike Drob wrote:
>>>>>>>>    What is the relationship between Daffodil and something like Apache Tika's
>>>>>>>>    extraction engine?
>>>>>>>>
>>>>>>>>    On Mon, Jul 24, 2017 at 9:53 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
>>>>>>>>
>>>>>>>>    Dear Apache Incubator Community,
>>>>>>>>
>>>>>>>>    We would like to start a discussion around a proposal to bring Daffodil
>>>>>>>>    into the Apache Incubator. Daffodil is an implementation of the DFDL
>>>>>>>>    specification used to convert between fixed format data and XML/JSON.
>>>>>>>>
>>>>>>>>    The draft proposal can be found in the wiki at the following URL:
>>>>>>>>
>>>>>>>>    https://wiki.apache.org/incubator/DaffodilProposal
>>>>>>>>
>>>>>>>>    We do not yet have a champion or mentors, but it was recommended that we
>>>>>>>>    create a proposal and send it to this list to potentially find those
>>>>>>>>    that might be interested. The text for the draft proposal is found
>>>>>>>>    below. We look forward to your input.
>>>>>>>>
>>>>>>>>    Thanks,
>>>>>>>>    -Steve
>>>>>>>>
>>>>>>>>
>>>>>>>>    = Daffodil Proposal =
>>>>>>>>
>>>>>>>>    == Abstract ==
>>>>>>>>
>>>>>>>>    Daffodil is an implementation of the Data Format Description Language
>>>>>>>>    (DFDL) used to convert between fixed format data and XML/JSON.
>>>>>>>>
>>>>>>>>    == Proposal ==
>>>>>>>>
>>>>>>>>    The Data Format Description Language (DFDL) is a specification,
>>>>>>>>    developed by the Open Grid Forum, capable of describing many data
>>>>>>>>    formats, including both textual and binary, scientific and numeric,
>>>>>>>>    legacy and modern, commercial record-oriented, and many industry and
>>>>>>>>    military standards. It defines a language that is a subset of W3C XML
>>>>>>>>    schema to describe the logical format of the data, and annotations
>>>>>>>>    within the schema to describe the physical representation.
>>>>>>>>
>>>>>>>>    Daffodil is an open source implementation of the DFDL specification that
>>>>>>>>    uses these DFDL schemas to parse fixed format data into an infoset,
>>>>>>>>    which is most commonly represented as either XML or JSON. This allows
>>>>>>>>    the use of well-established XML or JSON technologies and libraries to
>>>>>>>>    consume, inspect, and manipulate fixed format data in existing
>>>>>>>>    solutions. Daffodil is also capable of the reverse by serializing or
>>>>>>>>    "unparsing" an XML or JSON infoset back to the original data format.
>>>>>>>>
>>>>>>>>    == Background ==
>>>>>>>>
>>>>>>>>    Many different software solutions need to consume and manage data,
>>>>>>>>    including data directed routing, databases, data analysis, data
>>>>>>>>    cleansing, data visualizing, and more. A key aspect of such solutions is
>>>>>>>>    the need to transform the data into an easily consumable format.
>>>>>>>>    Usually, this means that for each unique data format, one develops a
>>>>>>>>    tool that can read and extract the necessary information, often leading
>>>>>>>>    to ad-hoc and data-format-specific description systems. Such systems are
>>>>>>>>    often proprietary, not well tested, and incompatible, leading to vendor
>>>>>>>>    lock-in, flawed software, and increased training costs. DFDL is a new
>>>>>>>>    standard, with version 1.0 completed in October of 2016, that solves
>>>>>>>>    these problems by defining an open standard to describe many different
>>>>>>>>    data formats and how to parse and unparse between the data and XML/JSON.
>>>>>>>>
>>>>>>>>    Two closed source implementations of DFDL currently exist. The first was
>>>>>>>>    created by IBM and is now part of their IBM® Integration Bus product.
>>>>>>>>    The second, called DFDL4S or "DFDL for Space", was created by the
>>>>>>>>    European Space Agency and is targeted at the challenges of their
>>>>>>>>    satellite data processing.
>>>>>>>>
>>>>>>>>    Around 2005, Pacific Northwest National Lab created Defuddle, built as
>>>>>>>>    an open source implementation and proof of concept of the draft DFDL
>>>>>>>>    specification and a test bed to feed new concepts into specification
>>>>>>>>    development. Primary development of Defuddle was eventually taken over
>>>>>>>>    by the National Center for Supercomputing Applications (NCSA). However,
>>>>>>>>    due to evolution of the DFDL specification and architectural and
>>>>>>>>    performance issues with Defuddle, around 2009, NCSA restarted the
>>>>>>>>    project with the new name of Daffodil, with a goal of implementing the
>>>>>>>>    complete DFDL specification. Daffodil development continued at NCSA
>>>>>>>>    until around 2012, at which point development slowed due to budget
>>>>>>>>    limitations. Shortly thereafter, primary development was picked up by
>>>>>>>>    Tresys Technology where it continues today, with contributions from
>>>>>>>>    other entities such as the Navy Research Lab, the Air Force Research
>>>>>>>>    Lab, MITRE, and Booz Allen Hamilton. In February of 2015, Daffodil
>>>>>>>>    version 1.0.0 was released, including support for the DFDL features
>>>>>>>>    needed to parse many common file formats. Daffodil version 2.0.0 is
>>>>>>>>    expected to be released in August of 2017, which will include unparse
>>>>>>>>    support with one-to-one parsing feature parity.
>>>>>>>>
>>>>>>>>    Entities including IBM, MITRE, NATO NCI Agency, Northrop-Grumman, Quark
>>>>>>>>    Security, Raytheon, and Tresys Technology have developed DFDL schemas
>>>>>>>>    for many data formats from varying technology domains, including PNG,
>>>>>>>>    GIF, BMP, PCAP, HL7, EDIFACT, NACHA, vCard, iCalendar, and MIL-STD-2045,
>>>>>>>>    many of which are publicly available on the DFDL Schemas GitHub. There
>>>>>>>>    are also a number of military-application data formats, the
>>>>>>>>    specifications of which are not public, which have historically been
>>>>>>>>    very difficult and expensive to process, and for which DFDL schemas have
>>>>>>>>    been created or are actively in development; these include
>>>>>>>>    MIL-STD-6040/USMTF ATO, MIL-STD-6017/VMF, and MIL-STD-6016/NATO STANAG
>>>>>>>>    5516 (aka "Link16").
>>>>>>>>
>>>>>>>>    == Rationale ==
>>>>>>>>
>>>>>>>>    Numerous software solutions exist that consume, inspect, analyze, and
>>>>>>>>    transform data, many of which can be found in the Apache Software
>>>>>>>>    Foundation (ASF). In order for tools like these to consume new types of
>>>>>>>>    data, custom extensions are usually required, often with high
>>>>>>>>    development and testing costs. Daffodil fills a clear gap in many of
>>>>>>>>    these solutions, providing a simple and low cost way to transform data
>>>>>>>>    to XML or JSON, which many of these tools natively support already. With
>>>>>>>>    the upcoming 2.0.0 release, the Daffodil project will have achieved a
>>>>>>>>    level of functionality in both parse and unparse that, when integrated
>>>>>>>>    into existing solutions, could provide for a new method to quickly
>>>>>>>>    enable support for new data formats.
>>>>>>>>
>>>>>>>>    == Initial Goals ==
>>>>>>>>
>>>>>>>>    * Relicense the existing code from the University of Illinois/NCSA Open
>>>>>>>>    Source License to the Apache License version 2.0, working with Apache
>>>>>>>>    Legal to ensure correctness, and with Daffodil contributors to get
>>>>>>>>    their permission.
>>>>>>>>    * Move the existing codebase, documentation, bugs, and mailing lists to
>>>>>>>>    the Apache hosted infrastructure.
>>>>>>>>    * Establish a formal release process and schedule, allowing for
>>>>>>>>    dependable release cycles in a manner consistent with the Apache
>>>>>>>>    development process.
>>>>>>>>    * Build relationships with ASF projects to add Daffodil support where
>>>>>>>>    appropriate.
>>>>>>>>    * Grow the community to establish a diversity of background and
>>>>>>>>    expertise.
>>>>>>>>
>>>>>>>>    == Current Status ==
>>>>>>>>
>>>>>>>>    === Meritocracy ===
>>>>>>>>
>>>>>>>>    All initial committers are familiar with the principles of meritocracy.
>>>>>>>>    The Daffodil project has followed the model of meritocracy in the past,
>>>>>>>>    providing multiple outside entities commit access based on the quality
>>>>>>>>    of their contributions. In order to grow the Daffodil user base and
>>>>>>>>    development community, we are dedicated to continuing to operate
>>>>>>>>    Daffodil as a meritocracy.
>>>>>>>>
>>>>>>>>    A key ingredient in a meritocracy of developers is open group code
>>>>>>>>    review. The Daffodil project has operated in this mode throughout its
>>>>>>>>    existence and this provides a forum to improve the code, verify code
>>>>>>>>    quality, and educate new developers on the code base.
>>>>>>>>
>>>>>>>>    === Community ===
>>>>>>>>
>>>>>>>>    Daffodil has a small community of users and developers. Although primary
>>>>>>>>    Daffodil development is done by Tresys Technology, a handful of other
>>>>>>>>    contributions have come from other entities including the Navy Research
>>>>>>>>    Lab, the Air Force Research Lab, MITRE, and Booz Allen Hamilton. In
>>>>>>>>    addition to developers, multiple users of Daffodil have created DFDL
>>>>>>>>    schemas, including entities such as MITRE, IBM, Raytheon, Quark
>>>>>>>>    Security, and Tresys Technology. The DFDL Schemas GitHub community has
>>>>>>>>    been created as a place for DFDL schemas to be published. The Daffodil
>>>>>>>>    project also makes use of mailing lists, !HipChat, and Confluence
>>>>>>>>    Questions to build a community of users and a system for support.
>>>>>>>>
>>>>>>>>    === Core Developers ===
>>>>>>>>
>>>>>>>>    The core developers of Daffodil are employed by Tresys Technology. We
>>>>>>>>    will work to grow the community among a more diverse set of developers
>>>>>>>>    and industries.
>>>>>>>>
>>>>>>>>    === Alignment ===
>>>>>>>>
>>>>>>>>    Daffodil was created as an open source project with a philosophy
>>>>>>>>    consistent with The Apache Way. A strong belief in meritocracy,
>>>>>>>>    community involvement in decisions, openness, and ensuring a high level
>>>>>>>>    of quality in code, documentation, and testing are some of our shared
>>>>>>>>    core beliefs.
>>>>>>>>
>>>>>>>>    Further, as mentioned in the Rationale section, Daffodil fills a gap
>>>>>>>>    that exists in many ASF projects, including !NiFi, Spark, Storm, Hadoop,
>>>>>>>>    Tika, and others. In order for tools like these to consume new types of
>>>>>>>>    data, custom extensions are usually required. Rather than create such
>>>>>>>>    extensions, Daffodil provides an easy and standards-compliant way to
>>>>>>>>    transform data to XML or JSON, which many of these tools already
>>>>>>>>    natively support.
>>>>>>>>
>>>>>>>>    == Known Risks ==
>>>>>>>>
>>>>>>>>    === Orphaned Products ===
>>>>>>>>
>>>>>>>>    The current core developers are the leading contributors in the space of
>>>>>>>>    DFDL and wish to see it flourish. Though there is some risk that the
>>>>>>>>    initial committers all come from the same company, a goal of entering
>>>>>>>>    into incubation is to grow the development community to minimize the
>>>>>>>>    risk of reliance on a single company.
>>>>>>>>
>>>>>>>>    === Inexperience with Open Source ===
>>>>>>>>
>>>>>>>>    The Daffodil project began as an open source project and has continued
>>>>>>>>    that model throughout development. This includes public bug tracking,
>>>>>>>>    git revision control, automated builds and tests, and a public wiki for
>>>>>>>>    documentation.
>>>>>>>>
>>>>>>>>    Additionally, the current core developers and initial committers all
>>>>>>>>    work for a company that relies on, believes in, promotes, and has led or
>>>>>>>>    contributed to many open source software projects, including SELinux
>>>>>>>>    Userspace, OpenSCAP, CLIP, refpolicy, setools, RPM, and others. As such,
>>>>>>>>    there is low risk related to inexperience with open source software and
>>>>>>>>    processes.
>>>>>>>>
>>>>>>>>    === Homogeneous Developers ===
>>>>>>>>
>>>>>>>>    The proposed initial committers come from a single entity, though we are
>>>>>>>>    committed to growing the Daffodil development community to include a
>>>>>>>>    broad group of additional committers from a wide array of industries.
>>>>>>>>
>>>>>>>>    === Reliance on Salaried Developers ===
>>>>>>>>
>>>>>>>>    The proposed initial committers are paid by their employer to contribute
>>>>>>>>    to the Daffodil project. We expect that Daffodil development will
>>>>>>>>    continue with salaried developers, and are committed to growing the
>>>>>>>>    community to include non-salaried developers as well.
>>>>>>>>
>>>>>>>>    === Relationship with other Apache Projects ===
>>>>>>>>
>>>>>>>>    As mentioned in the Alignment section, Daffodil fills a clear gap in
>>>>>>>>    numerous other ASF projects that consume and manage large amounts of
>>>>>>>>    data.
>>>>>>>>
>>>>>>>>    As a specific example, Daffodil developers have created a Daffodil
>>>>>>>>    Apache !NiFi Processor, currently in use in data transfer solutions,
>>>>>>>>    which allows one to ingest non-native data into an Apache !NiFi pipeline
>>>>>>>>    as XML or JSON. This processor was well received by the Apache !NiFi
>>>>>>>>    developers, with positive comments about the concise API and how it
>>>>>>>>    could handle non-native data. Daffodil developers have also successfully
>>>>>>>>    prototyped integration with Apache Spark. We believe Daffodil could
>>>>>>>>    provide a strong benefit to many other ASF projects that handle fixed
>>>>>>>>    format data. We anticipate working closely with such ASF projects to
>>>>>>>>    include Daffodil where applicable to increase their ability to support
>>>>>>>>    new data formats with minimal effort.
>>>>>>>>
>>>>>>>>    Daffodil also depends on existing ASF projects, including Apache Commons
>>>>>>>>    and Apache Xerces.
>>>>>>>>
>>>>>>>>    === An Excessive Fascination with the Apache Brand ===
>>>>>>>>
>>>>>>>>    Although the Apache brand may certainly help to attract more
>>>>>>>>    contributors, publicity is not the reason for this proposal. We believe
>>>>>>>>    Daffodil could provide a great benefit to the ASF and the numerous data
>>>>>>>>    focused projects that comprise it, as described in the Rationale and
>>>>>>>>    Alignment sections. We hope to build a strong and vibrant community
>>>>>>>>    built around The Apache Way, and not dependent on a single company.
>>>>>>>>
>>>>>>>>    === Documentation ===
>>>>>>>>
>>>>>>>>    Daffodil documentation can be found at:
>>>>>>>>
>>>>>>>>    * https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL
>>>>>>>>
>>>>>>>>    Information about DFDL can be found at:
>>>>>>>>
>>>>>>>>    * https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
>>>>>>>>    * https://www.ibm.com/support/knowledgecenter/en/SSMKHH_9.0.0/com.ibm.etools.mft.doc/df20060_.htm
>>>>>>>>
>>>>>>>>    Public examples of DFDL Schemas can be found at:
>>>>>>>>
>>>>>>>>    * https://github.com/DFDLSchemas
>>>>>>>>
>>>>>>>>    == Initial Source ==
>>>>>>>>
>>>>>>>>    The Daffodil git repo goes back to mid-2011 with approximately 20
>>>>>>>>    different contributors and feedback from many users and developers. The
>>>>>>>>    core codebase is written in Scala and includes both a Scala and Java
>>>>>>>>    API, along with Javadocs and Scaladocs for API usage. The initial code
>>>>>>>>    will come from the git repository currently hosted by NCSA at the
>>>>>>>>    University of Illinois:
>>>>>>>>
>>>>>>>>    https://opensource.ncsa.illinois.edu/bitbucket/projects/DFDL/repos/daffodil/
>>>>>>>>
>>>>>>>>    == Source and Intellectual Property Submission ==
>>>>>>>>
>>>>>>>>    The complete Daffodil code is licensed under the University of
>>>>>>>>    Illinois/NCSA Open Source License. Much of the current codebase has been
>>>>>>>>    developed by Tresys Technology, who is open to relicensing the code to
>>>>>>>>    the Apache License version 2.0 and donating the source to the ASF.
>>>>>>>>    Contacts at NCSA are also open to relicensing their contributions to
>>>>>>>>    Apache v2. We plan to contact the other contributors and ask for
>>>>>>>>    permission to relicense and donate their contributed code. For those
>>>>>>>>    who decline or whom we cannot contact, their code will be removed or
>>>>>>>>    replaced. We will work closely with Apache Legal to ensure all issues
>>>>>>>>    related to relicensing are acceptable.
>>>>>>>>
>>>>>>>>    == External Dependencies ==
>>>>>>>>
>>>>>>>>    We believe all current dependencies are compatible with the ASF
>>>>>>>>    guidelines. Our dependency licenses come from the following license
>>>>>>>>    styles: Apache v2, BSD, MIT, and ICU. The list of current Daffodil
>>>>>>>>    dependencies and their licenses is documented here:
>>>>>>>>
>>>>>>>>    https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Dependencies+and+Licenses
>>>>>>>>
>>>>>>>>    == Cryptography ==
>>>>>>>>
>>>>>>>>    None
>>>>>>>>
>>>>>>>>    == Required Resources ==
>>>>>>>>
>>>>>>>>    === Mailing Lists ===
>>>>>>>>
>>>>>>>>    * commits@daffodil.incubator.apache.org
>>>>>>>>    * dev@daffodil.incubator.apache.org
>>>>>>>>    * private@daffodil.incubator.apache.org
>>>>>>>>    * user@daffodil.incubator.apache.org
>>>>>>>>
>>>>>>>>    === Source Control ===
>>>>>>>>
>>>>>>>>    git://git.apache.org/incubator-daffodil.git
>>>>>>>>
>>>>>>>>    === Issue Tracking ===
>>>>>>>>
>>>>>>>>    JIRA Daffodil (DFDL)
>>>>>>>>
>>>>>>>>    === Initial Committers ===
>>>>>>>>
>>>>>>>>    * Beth Finnegan <efinnegan at tresys dot com>
>>>>>>>>    * Dave Thompson <dthompson at tresys dot com>
>>>>>>>>    * Josh Adams <jadams at tresys dot com>
>>>>>>>>    * Mike Beckerle <mbeckerle at tresys dot com>
>>>>>>>>    * Steve Lawrence <slawrence at tresys dot com>
>>>>>>>>    * Taylor Wise <twise at tresys dot com>
>>>>>>>>
>>>>>>>>    === Affiliations ===
>>>>>>>>
>>>>>>>>    * Beth Finnegan (Tresys Technology)
>>>>>>>>    * Dave Thompson (Tresys Technology)
>>>>>>>>    * Josh Adams (Tresys Technology)
>>>>>>>>    * Mike Beckerle (Tresys Technology)
>>>>>>>>    * Steve Lawrence (Tresys Technology)
>>>>>>>>    * Taylor Wise (Tresys Technology)
>>>>>>>>
>>>>>>>>    == Sponsors ==
>>>>>>>>
>>>>>>>>    === Champion ===
>>>>>>>>
>>>>>>>>    * TBD
>>>>>>>>
>>>>>>>>    === Nominated Mentors ===
>>>>>>>>
>>>>>>>>    * TBD
>>>>>>>>
>>>>>>>>    === Sponsoring Entity ===
>>>>>>>>
>>>>>>>>    We request the Apache Incubator to sponsor this project.
>>>>>>>>
>>>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>>>>>    To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>>>>>    For additional commands, e-mail: general-help@incubator.apache.org