incubator-general mailing list archives

From "John D. Ament" <johndam...@apache.org>
Subject Re: [DISCUSS] Daffodil Incubation Proposal
Date Thu, 10 Aug 2017 02:19:42 GMT
Steve,

You could list either of us.

John

On Wed, Aug 9, 2017 at 11:55 AM Steve Lawrence <stephen.d.lawrence@gmail.com>
wrote:

> Sounds good to me. Can I start a vote, or is that something a champion/mentor
> would normally start? The project also does not have a champion--is that
> necessary/would either of you be interested in being the champion?
>
> Thanks,
> - Steve
>
> On 08/08/2017 10:59 PM, Dave Fisher wrote:
> > Hi -
> >
> > I agree. I'm willing to proceed with John and me as Mentors.
> >
> > Regards,
> > Dave
> >
> > Sent from my iPhone
> >
> >> On Aug 8, 2017, at 7:10 PM, John D. Ament <johndament@apache.org> wrote:
> >>
> >> Steve,
> >>
> >> At this point, I'd recommend we wrap the discussion and call for a
> >> vote.  While ideally we want 3 mentors, we can get started with 2 and see
> >> how things progress.
> >>
> >> John
> >>
> >>> On Wed, Aug 2, 2017 at 3:55 PM Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
> >>> Thanks John!
> >>>
> >>> On 08/02/2017 03:23 PM, John D. Ament wrote:
> >>>> You can also count me in as a mentor.
> >>>>
> >>>> John
> >>>>
> >>>> On Wed, Aug 2, 2017 at 3:14 PM Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
> >>>>
> >>>>> Understood. Thanks for the interest!
> >>>>>
> >>>>> - Steve
> >>>>>
> >>>>> On 08/02/2017 02:57 PM, Dave Fisher wrote:
> >>>>>> Hi Steve,
> >>>>>>
> >>>>>> It was not so much the lack of committers as it was the current
> >>>>>> diversity. That is not a blocker for entry to Incubation.
> >>>>>>
> >>>>>> I am willing to be one of the Mentors. Once there are at least two more
> >>>>>> we can push forward.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Dave
> >>>>>>
> >>>>>>> On Aug 1, 2017, at 5:09 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Discussions have died down, and I think the consensus from the responses
> >>>>>>> is that the issues are 1) the lack of committers and 2) the lack of a
> >>>>>>> champion and mentors. We hope to address #1 and grow the community as
> >>>>>>> part of incubation. Is anyone interested in being a champion or mentor
> >>>>>>> and helping us with #2?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> - Steve
> >>>>>>>
> >>>>>>> On 07/26/2017 04:06 PM, Chris Mattmann wrote:
> >>>>>>>> This sounds like a very interesting project.
> >>>>>>>>
> >>>>>>>> I don’t have the time to mentor at the moment but I will keep a close
> >>>>>>>> eye on it.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Chris Mattmann
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 7/25/17, 11:53 AM, "McHenry, Kenton Guadron" <mchenry@illinois.edu> wrote:
> >>>>>>>>
> >>>>>>>>    Hi Dave,
> >>>>>>>>
> >>>>>>>>    The developers that were at NCSA have moved on to other
> >>>>>>>>    organizations.  While we still leverage Daffodil and are very much
> >>>>>>>>    interested in seeing it move forward, development is currently done
> >>>>>>>>    by the Tresys team.  Agreed on the synergy with Tika.
> >>>>>>>>
> >>>>>>>>    Kenton McHenry, Ph.D.
> >>>>>>>>    Principal Research Scientist, Adjunct Assistant Professor of Computer Science
> >>>>>>>>    Deputy Director of the Scientific Software & Applications Division
> >>>>>>>>    National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
> >>>>>>>>
> >>>>>>>>    On Jul 24, 2017, at 1:55 PM, Dave Fisher <dave2wave@comcast.net> wrote:
> >>>>>>>>
> >>>>>>>>    Hi Kenton,
> >>>>>>>>
> >>>>>>>>    Is there any reason that you and others from the NCSA are not
> >>>>>>>>    Initial Committers? That would make this proposal stronger.
> >>>>>>>>
> >>>>>>>>    Regarding Apache Tika - it relies on other projects including
> >>>>>>>>    Apache POI and Apache PDFBox. They are pragmatic about what is
> >>>>>>>>    used. If Daffodil works to expand then I think that there would be
> >>>>>>>>    good synergy between the projects. I know as a POI PMC member that
> >>>>>>>>    the POI community has significantly benefited from the Tika
> >>>>>>>>    community some of whom are from Mitre.
> >>>>>>>>
> >>>>>>>>    To date Tika has not emphasized structured data, although they do
> >>>>>>>>    extract content from Excel and OpenOffice.
> >>>>>>>>
> >>>>>>>>    I am intrigued.
> >>>>>>>>
> >>>>>>>>    Regards,
> >>>>>>>>    Dave
> >>>>>>>>
> >>>>>>>>    On Jul 24, 2017, at 10:55 AM, McHenry, Kenton Guadron <mchenry@illinois.edu> wrote:
> >>>>>>>>
> >>>>>>>>    Yes, DFDL and its open source implementation Daffodil are more
> >>>>>>>>    about file formats and getting access to the entirety of a file's
> >>>>>>>>    contents in a consistent way through machine readable
> >>>>>>>>    specifications.  The work has implications in the area of digital
> >>>>>>>>    preservation, allowing one to preserve these machine readable
> >>>>>>>>    specifications rather than all the tools needed to open/save a
> >>>>>>>>    file in order to work with it.  Imagine someone developing
> >>>>>>>>    graphics software to work with 3D models and not having to worry
> >>>>>>>>    about the hundreds of formats out there for 3D meshes (whether
> >>>>>>>>    there are tools for opening the files and whether they can get
> >>>>>>>>    access to those tools, whether the spec is available and worrying
> >>>>>>>>    about how complex that spec is to implement, etc.), and simply
> >>>>>>>>    building their code around the contents (e.g. vertices, faces,
> >>>>>>>>    etc.).  One could come up with similar scenarios for other data
> >>>>>>>>    types (documents, images, videos, audio, depth data, numeric
> >>>>>>>>    data). Ideally, tools built to support DFDL could someday support
> >>>>>>>>    any format for that type without the developer having to worry
> >>>>>>>>    about the details of how that data is represented within a file.
> >>>>>>>>
> >>>>>>>>    Kenton McHenry, Ph.D.
> >>>>>>>>    Principal Research Scientist, Adjunct Assistant Professor of Computer Science
> >>>>>>>>    Deputy Director of the Scientific Software & Applications Division
> >>>>>>>>    National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
> >>>>>>>>
> >>>>>>>>    On Jul 24, 2017, at 10:30 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>    I'll preface this by saying that I don't have a ton of experience
> >>>>>>>>    with Apache Tika. But based on my understanding, Tika and Daffodil
> >>>>>>>>    do have somewhat similar goals, but reach them in different ways.
> >>>>>>>>    For example, Tika requires that one writes /code/ to perform data
> >>>>>>>>    extraction, usually relying on existing Java libraries to extract
> >>>>>>>>    the desired metadata. The downside to this is that code can be
> >>>>>>>>    buggy, and libraries might not even exist for formats of interest
> >>>>>>>>    (especially common with legacy and military data).
> >>>>>>>>
> >>>>>>>>    Daffodil, on the other hand, does not require one to write any
> >>>>>>>>    code. Instead, one writes a DFDL Schema (similar to XML Schema,
> >>>>>>>>    with DFDL annotations) that fully describes the data, which
> >>>>>>>>    Daffodil then uses to convert the data to XML/JSON for extraction.
> >>>>>>>>    So adding support for a new format means writing a new schema
> >>>>>>>>    rather than new code. And less code generally means fewer bugs.
> >>>>>>>>    Also, for secure systems that require certification, generally
> >>>>>>>>    speaking, it is easier to certify a schema as compared to code.
> >>>>>>>>
> >>>>>>>>    We certainly don't believe that Daffodil could replace Tika, but
> >>>>>>>>    it does have the potential to add new functionality to Tika for
> >>>>>>>>    formats that do not have existing libraries. One of our goals is
> >>>>>>>>    to look into integrating Daffodil support into tools like Tika.
> >>>>>>>>    We'd love to hear from Tika devs if this is something they'd be
> >>>>>>>>    interested in.
> >>>>>>>>
> >>>>>>>>    I'll also add that whereas Tika tends to focus primarily on
> >>>>>>>>    metadata, DFDL schemas usually describe an entire file format down
> >>>>>>>>    to the byte, so one can extract more than just metadata, including
> >>>>>>>>    text and binary data. Further differentiating, Daffodil has
> >>>>>>>>    support for serializing data (called unparse) from the XML/JSON
> >>>>>>>>    representation, allowing one to transform or filter data as well.
> >>>>>>>>    We don't believe this feature is all that applicable to Tika, but
> >>>>>>>>    it may be useful to other technologies such as filtering or data
> >>>>>>>>    fuzzing technologies.
> >>>>>>>>
> >>>>>>>>    - Steve
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    On 07/24/2017 10:59 AM, Mike Drob wrote:
> >>>>>>>>    What is the relationship between Daffodil and something like
> >>>>>>>>    Apache Tika's extraction engine?
> >>>>>>>>
> >>>>>>>>    On Mon, Jul 24, 2017 at 9:53 AM, Steve Lawrence <stephen.d.lawrence@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>    Dear Apache Incubator Community,
> >>>>>>>>
> >>>>>>>>    We would like to start a discussion around a proposal to bring
> >>>>>>>>    Daffodil into the Apache Incubator. Daffodil is an implementation
> >>>>>>>>    of the DFDL specification used to convert between fixed format
> >>>>>>>>    data and XML/JSON.
> >>>>>>>>
> >>>>>>>>    The draft proposal can be found in the wiki at the following URL:
> >>>>>>>>
> >>>>>>>>    https://wiki.apache.org/incubator/DaffodilProposal
> >>>>>>>>
> >>>>>>>>    We do not yet have a champion or mentors, but it was recommended
> >>>>>>>>    that we create a proposal and send it to this list to potentially
> >>>>>>>>    find those that might be interested. The text for the draft
> >>>>>>>>    proposal is found below. We look forward to your input.
> >>>>>>>>
> >>>>>>>>    Thanks,
> >>>>>>>>    -Steve
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    = Daffodil Proposal =
> >>>>>>>>
> >>>>>>>>    == Abstract ==
> >>>>>>>>
> >>>>>>>>    Daffodil is an implementation of the Data Format Description
> >>>>>>>>    Language (DFDL) used to convert between fixed format data and
> >>>>>>>>    XML/JSON.
> >>>>>>>>
> >>>>>>>>    == Proposal ==
> >>>>>>>>
> >>>>>>>>    The Data Format Description Language (DFDL) is a specification,
> >>>>>>>>    developed by the Open Grid Forum, capable of describing many data
> >>>>>>>>    formats, including both textual and binary, scientific and
> >>>>>>>>    numeric, legacy and modern, commercial record-oriented, and many
> >>>>>>>>    industry and military standards. It defines a language that is a
> >>>>>>>>    subset of W3C XML schema to describe the logical format of the
> >>>>>>>>    data, and annotations within the schema to describe the physical
> >>>>>>>>    representation.
> >>>>>>>>
> >>>>>>>>    Daffodil is an open source implementation of the DFDL
> >>>>>>>>    specification that uses these DFDL schemas to parse fixed format
> >>>>>>>>    data into an infoset, which is most commonly represented as either
> >>>>>>>>    XML or JSON. This allows the use of well-established XML or JSON
> >>>>>>>>    technologies and libraries to consume, inspect, and manipulate
> >>>>>>>>    fixed format data in existing solutions. Daffodil is also capable
> >>>>>>>>    of the reverse by serializing or "unparsing" an XML or JSON
> >>>>>>>>    infoset back to the original data format.
> >>>>>>>>    == Background ==
> >>>>>>>>
> >>>>>>>>    Many different software solutions need to consume and manage data,
> >>>>>>>>    including data directed routing, databases, data analysis, data
> >>>>>>>>    cleansing, data visualizing, and more. A key aspect of such
> >>>>>>>>    solutions is the need to transform the data into an easily
> >>>>>>>>    consumable format. Usually, this means that for each unique data
> >>>>>>>>    format, one develops a tool that can read and extract the
> >>>>>>>>    necessary information, often leading to ad-hoc and
> >>>>>>>>    data-format-specific description systems. Such systems are often
> >>>>>>>>    proprietary, not well tested, and incompatible, leading to vendor
> >>>>>>>>    lock-in, flawed software, and increased training costs. DFDL is a
> >>>>>>>>    new standard, with version 1.0 completed in October of 2016, that
> >>>>>>>>    solves these problems by defining an open standard to describe
> >>>>>>>>    many different data formats and how to parse and unparse between
> >>>>>>>>    the data and XML/JSON.
> >>>>>>>>
> >>>>>>>>    Two closed source implementations of DFDL currently exist. The
> >>>>>>>>    first was created by IBM and is now part of their IBM® Integration
> >>>>>>>>    Bus product. The second, created by the European Space Agency, is
> >>>>>>>>    called DFDL4S or "DFDL for Space" and is targeted at the
> >>>>>>>>    challenges of their satellite data processing.
> >>>>>>>>
> >>>>>>>>    Around 2005, Pacific Northwest National Lab created Defuddle,
> >>>>>>>>    built as an open source implementation and proof of concept of the
> >>>>>>>>    draft DFDL specification and a test bed to feed new concepts into
> >>>>>>>>    specification development. Primary development of Defuddle was
> >>>>>>>>    eventually taken over by the National Center for Supercomputing
> >>>>>>>>    Applications (NCSA). However, due to evolution of the DFDL
> >>>>>>>>    specification and architectural and performance issues with
> >>>>>>>>    Defuddle, around 2009, NCSA restarted the project with the new
> >>>>>>>>    name of Daffodil, with a goal of implementing the complete DFDL
> >>>>>>>>    specification. Daffodil development continued at NCSA until around
> >>>>>>>>    2012, at which point development slowed due to budget limitations.
> >>>>>>>>    Shortly thereafter, primary development was picked up by Tresys
> >>>>>>>>    Technology where it continues today, with contributions from other
> >>>>>>>>    entities such as the Navy Research Lab, the Air Force Research
> >>>>>>>>    Lab, MITRE, and Booz Allen Hamilton. In February of 2015, Daffodil
> >>>>>>>>    version 1.0.0 was released, including support for the DFDL
> >>>>>>>>    features needed to parse many common file formats. Daffodil
> >>>>>>>>    version 2.0.0 is expected to be released in August of 2017, which
> >>>>>>>>    will include unparse support with one-to-one parsing feature
> >>>>>>>>    parity.
> >>>>>>>>
> >>>>>>>>    Entities including IBM, MITRE, NATO NCI Agency, Northrop-Grumman,
> >>>>>>>>    Quark Security, Raytheon, and Tresys Technology have developed
> >>>>>>>>    DFDL schemas for many data formats from varying technology
> >>>>>>>>    domains, including PNG, GIF, BMP, PCAP, HL7, EDIFACT, NACHA,
> >>>>>>>>    vCard, iCalendar, and MIL-STD-2045, many of which are publicly
> >>>>>>>>    available on the DFDL Schemas github. There are also a number of
> >>>>>>>>    military-application data formats, the specifications of which are
> >>>>>>>>    not public, which have historically been very difficult and
> >>>>>>>>    expensive to process, and for which DFDL schemas have been created
> >>>>>>>>    or are actively in development; these include MIL-STD-6040/USMTF
> >>>>>>>>    ATO, MIL-STD-6017/VMF, and MIL-STD-6016/NATO STANAG 5516 (aka
> >>>>>>>>    "Link16").
> >>>>>>>>
> >>>>>>>>    == Rationale ==
> >>>>>>>>
> >>>>>>>>    Numerous software solutions exist that consume, inspect, analyze,
> >>>>>>>>    and transform data, many of which can be found in the Apache
> >>>>>>>>    Software Foundation (ASF). In order for tools like these to
> >>>>>>>>    consume new types of data, custom extensions are usually required,
> >>>>>>>>    often with high development and testing costs. Daffodil fills a
> >>>>>>>>    clear gap in many of these solutions, providing a simple and low
> >>>>>>>>    cost way to transform data to XML or JSON, which many of these
> >>>>>>>>    tools natively support already. With the upcoming 2.0.0 release,
> >>>>>>>>    the Daffodil project will have achieved a level of functionality
> >>>>>>>>    in both parse and unparse that, when integrated into existing
> >>>>>>>>    solutions, could provide for a new method to quickly enable
> >>>>>>>>    support for new data formats.
> >>>>>>>>
> >>>>>>>>    == Initial Goals ==
> >>>>>>>>
> >>>>>>>>    * Relicense the existing code from the University of Illinois/NCSA
> >>>>>>>>    Open Source License to the Apache License version 2.0, working
> >>>>>>>>    with Apache Legal to ensure correctness, and with Daffodil
> >>>>>>>>    contributors to get their permission.
> >>>>>>>>    * Move the existing codebase, documentation, bugs, and mailing
> >>>>>>>>    lists to the Apache hosted infrastructure.
> >>>>>>>>    * Establish a formal release process and schedule, allowing for
> >>>>>>>>    dependable release cycles in a manner consistent with the Apache
> >>>>>>>>    development process.
> >>>>>>>>    * Build relationships with ASF projects to add Daffodil support
> >>>>>>>>    where appropriate.
> >>>>>>>>    * Grow the community to establish a diversity of background and
> >>>>>>>>    expertise.
> >>>>>>>>
> >>>>>>>>    == Current Status ==
> >>>>>>>>
> >>>>>>>>    === Meritocracy ===
> >>>>>>>>
> >>>>>>>>    All initial committers are familiar with the principles of
> >>>>>>>>    meritocracy. The Daffodil project has followed the model of
> >>>>>>>>    meritocracy in the past, providing multiple outside entities
> >>>>>>>>    commit access based on the quality of their contributions. In
> >>>>>>>>    order to grow the Daffodil user base and development community, we
> >>>>>>>>    are dedicated to continuing to operate Daffodil as a meritocracy.
> >>>>>>>>
> >>>>>>>>    A key ingredient in a meritocracy of developers is open group code
> >>>>>>>>    review. The Daffodil project has operated in this mode throughout
> >>>>>>>>    its existence and this provides a forum to improve the code,
> >>>>>>>>    verify code quality, and educate new developers on the code base.
> >>>>>>>>
> >>>>>>>>    === Community ===
> >>>>>>>>
> >>>>>>>>    Daffodil has a small community of users and developers. Although
> >>>>>>>>    primary Daffodil development is done by Tresys Technology, a
> >>>>>>>>    handful of other contributions have come from other entities
> >>>>>>>>    including the Navy Research Lab, the Air Force Research Lab,
> >>>>>>>>    MITRE, and Booz Allen Hamilton. In addition to developers,
> >>>>>>>>    multiple users of Daffodil have created DFDL schemas, including
> >>>>>>>>    entities such as MITRE, IBM, Raytheon, Quark Security, and Tresys
> >>>>>>>>    Technology. The DFDL Schemas github community has been created as
> >>>>>>>>    a place for DFDL schemas to be published. The Daffodil project
> >>>>>>>>    also makes use of mailing lists, !HipChat, and Confluence
> >>>>>>>>    Questions to build a community of users and a system for support.
> >>>>>>>>
> >>>>>>>>    === Core Developers ===
> >>>>>>>>
> >>>>>>>>    The core developers of Daffodil are employed by Tresys Technology.
> >>>>>>>>    We will work to grow the community among a more diverse set of
> >>>>>>>>    developers and industries.
> >>>>>>>>
> >>>>>>>>    === Alignment ===
> >>>>>>>>
> >>>>>>>>    Daffodil was created as an open source project with a philosophy
> >>>>>>>>    consistent with The Apache Way. A strong belief in meritocracy,
> >>>>>>>>    community involvement in decisions, openness, and ensuring a high
> >>>>>>>>    level of quality in code, documentation, and testing are some of
> >>>>>>>>    our shared core beliefs.
> >>>>>>>>
> >>>>>>>>    Further, as mentioned in the Rationale section, Daffodil fills a
> >>>>>>>>    gap that exists in many ASF projects, including !NiFi, Spark,
> >>>>>>>>    Storm, Hadoop, Tika, and others. In order for tools like these to
> >>>>>>>>    consume new types of data, custom extensions are usually required.
> >>>>>>>>    Rather than requiring such extensions, Daffodil provides an easy
> >>>>>>>>    and standards-compliant way to transform data to XML or JSON,
> >>>>>>>>    which many of these tools already natively support.
> >>>>>>>>
> >>>>>>>>    == Known Risks ==
> >>>>>>>>
> >>>>>>>>    === Orphaned Products ===
> >>>>>>>>
> >>>>>>>>    The current core developers are the leading contributors in the
> >>>>>>>>    space of DFDL and wish to see it flourish. Though there is some
> >>>>>>>>    risk in that the initial committers all come from the same
> >>>>>>>>    company, a goal of entering into incubation is to grow the
> >>>>>>>>    development community to minimize the risk of reliance on a single
> >>>>>>>>    company.
> >>>>>>>>
> >>>>>>>>    === Inexperience with Open Source ===
> >>>>>>>>
> >>>>>>>>    The Daffodil project began as an open source project and has
> >>>>>>>>    continued that model throughout development. This includes public
> >>>>>>>>    bug tracking, git revision control, automated builds and tests,
> >>>>>>>>    and a public wiki for documentation.
> >>>>>>>>
> >>>>>>>>    Additionally, the current core developers and initial committers
> >>>>>>>>    all work for a company that relies on, believes in, promotes, and
> >>>>>>>>    has led or contributed to many open source software projects,
> >>>>>>>>    including SELinux Userspace, OpenSCAP, CLIP, refpolicy, setools,
> >>>>>>>>    RPM, and others. As such, there is low risk related to
> >>>>>>>>    inexperience with open source software and processes.
> >>>>>>>>
> >>>>>>>>    === Homogeneous Developers ===
> >>>>>>>>
> >>>>>>>>    The proposed initial committers come from a single entity, though
> >>>>>>>>    we are committed to growing the Daffodil development community to
> >>>>>>>>    include a broad group of additional committers from a wide array
> >>>>>>>>    of industries.
> >>>>>>>>
> >>>>>>>>    === Reliance on Salaried Developers ===
> >>>>>>>>
> >>>>>>>>    The proposed initial committers are paid by their employer to
> >>>>>>>>    contribute to the Daffodil project. We expect that Daffodil
> >>>>>>>>    development will continue with salaried developers, and are
> >>>>>>>>    committed to growing the community to include non-salaried
> >>>>>>>>    developers as well.
> >>>>>>>>
> >>>>>>>>    === Relationship with other Apache Projects ===
> >>>>>>>>
> >>>>>>>>    As mentioned in the Alignment section, Daffodil fills a clear gap
> >>>>>>>>    in numerous other ASF projects that consume and manage large
> >>>>>>>>    amounts of data.
> >>>>>>>>
> >>>>>>>>    As a specific example, Daffodil developers have created a Daffodil
> >>>>>>>>    Apache !NiFi Processor, currently in use in data transfer
> >>>>>>>>    solutions, which allows one to ingest non-native data into an
> >>>>>>>>    Apache !NiFi pipeline as XML or JSON. This processor was well
> >>>>>>>>    received by the Apache !NiFi developers, with positive comments
> >>>>>>>>    about the concise API and how it could handle non-native data.
> >>>>>>>>    Daffodil developers have also successfully prototyped integration
> >>>>>>>>    with Apache Spark. We believe Daffodil could provide a strong
> >>>>>>>>    benefit to many other ASF projects that handle fixed format data.
> >>>>>>>>    We anticipate working closely with such ASF projects to include
> >>>>>>>>    Daffodil where applicable to increase their ability to support new
> >>>>>>>>    data formats with minimal effort.
> >>>>>>>>
> >>>>>>>>    Daffodil also depends on existing ASF projects, including Apache
> >>>>>>>>    Commons and Apache Xerces.
> >>>>>>>>
> >>>>>>>>    === An Excessive Fascination with the Apache Brand ===
> >>>>>>>>
> >>>>>>>>    Although the Apache brand may certainly help to attract more
> >>>>>>>>    contributors, publicity is not the reason for this proposal. We
> >>>>>>>>    believe Daffodil could provide a great benefit to the ASF and the
> >>>>>>>>    numerous data focused projects that comprise it, as described in
> >>>>>>>>    the Rationale and Alignment sections. We hope to build a strong
> >>>>>>>>    and vibrant community built around The Apache Way, and not
> >>>>>>>>    dependent on a single company.
> >>>>>>>>
> >>>>>>>>    === Documentation ===
> >>>>>>>>
> >>>>>>>>    Daffodil documentation can be found at:
> >>>>>>>>
> >>>>>>>>    * https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL
> >>>>>>>>
> >>>>>>>>    Information about DFDL can be found at:
> >>>>>>>>
> >>>>>>>>    * https://www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> >>>>>>>>    * https://www.ibm.com/support/knowledgecenter/en/SSMKHH_9.0.0/com.ibm.etools.mft.doc/df20060_.htm
> >>>>>>>>
> >>>>>>>>    Public examples of DFDL Schemas can be found at:
> >>>>>>>>
> >>>>>>>>    * https://github.com/DFDLSchemas
> >>>>>>>>
> >>>>>>>>    == Initial Source ==
> >>>>>>>>
> >>>>>>>>    The Daffodil git repo goes back to mid-2011 with approximately 20
> >>>>>>>>    different contributors and feedback from many users and
> >>>>>>>>    developers. The core codebase is written in Scala and includes
> >>>>>>>>    both a Scala and Java API, along with Javadocs and Scaladocs for
> >>>>>>>>    API usage. The initial code will come from the git repository
> >>>>>>>>    currently hosted by NCSA at the University of Illinois:
> >>>>>>>>
> >>>>>>>>    https://opensource.ncsa.illinois.edu/bitbucket/projects/DFDL/repos/daffodil/
> >>>>>>>>
> >>>>>>>>    == Source and Intellectual Property Submission ==
> >>>>>>>>
> >>>>>>>>    The complete Daffodil code is licensed under the University of
> >>>>>>>>    Illinois/NCSA Open Source License. Much of the current codebase
> >>>>>>>>    has been developed by Tresys Technology, who is open to
> >>>>>>>>    relicensing the code to the Apache License version 2.0 and
> >>>>>>>>    donating the source to the ASF. Contacts at NCSA are also open to
> >>>>>>>>    relicensing their contributions to Apache v2. We plan to contact
> >>>>>>>>    the other contributors and ask for permission to relicense and
> >>>>>>>>    donate their contributed code. For those who decline or whom we
> >>>>>>>>    cannot contact, their code will be removed or replaced. We will
> >>>>>>>>    work closely with Apache Legal to ensure all issues related to
> >>>>>>>>    relicensing are acceptable.
> >>>>>>>>
> >>>>>>>>    == External Dependencies ==
> >>>>>>>>
> >>>>>>>>    We believe all current dependencies are compatible with the ASF
> >>>>>>>>    guidelines. Our dependency licenses come from the following
> >>>>>>>>    license styles: Apache v2, BSD, MIT, and ICU. The list of current
> >>>>>>>>    Daffodil dependencies and their licenses is documented here:
> >>>>>>>>
> >>>>>>>>    https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Dependencies+and+Licenses
> >>>>>>>>
> >>>>>>>>    == Cryptography ==
> >>>>>>>>
> >>>>>>>>    None
> >>>>>>>>
> >>>>>>>>    == Required Resources ==
> >>>>>>>>
> >>>>>>>>    === Mailing Lists ===
> >>>>>>>>
> >>>>>>>>    * commits@daffodil.incubator.apache.org
> >>>>>>>>    * dev@daffodil.incubator.apache.org
> >>>>>>>>    * private@daffodil.incubator.apache.org
> >>>>>>>>    * user@daffodil.incubator.apache.org
> >>>>>>>>
> >>>>>>>>    === Source Control ===
> >>>>>>>>
> >>>>>>>>    git://git.apache.org/incubator-daffodil.git
> >>>>>>>>
> >>>>>>>>    === Issue Tracking ===
> >>>>>>>>
> >>>>>>>>    JIRA Daffodil (DFDL)
> >>>>>>>>
> >>>>>>>>    === Initial Committers ===
> >>>>>>>>
> >>>>>>>>    * Beth Finnegan <efinnegan at tresys dot com>
> >>>>>>>>    * Dave Thompson <dthompson at tresys dot com>
> >>>>>>>>    * Josh Adams <jadams at tresys dot com>
> >>>>>>>>    * Mike Beckerle <mbeckerle at tresys dot com>
> >>>>>>>>    * Steve Lawrence <slawrence at tresys dot com>
> >>>>>>>>    * Taylor Wise <twise at tresys dot com>
> >>>>>>>>
> >>>>>>>>    === Affiliations ===
> >>>>>>>>
> >>>>>>>>    * Beth Finnegan (Tresys Technology)
> >>>>>>>>    * Dave Thompson (Tresys Technology)
> >>>>>>>>    * Josh Adams (Tresys Technology)
> >>>>>>>>    * Mike Beckerle (Tresys Technology)
> >>>>>>>>    * Steve Lawrence (Tresys Technology)
> >>>>>>>>    * Taylor Wise (Tresys Technology)
> >>>>>>>>
> >>>>>>>>    == Sponsors ==
> >>>>>>>>
> >>>>>>>>    === Champion ===
> >>>>>>>>
> >>>>>>>>    * TBD
> >>>>>>>>
> >>>>>>>>    === Nominated Mentors ===
> >>>>>>>>
> >>>>>>>>    * TBD
> >>>>>>>>
> >>>>>>>>    === Sponsoring Entity ===
> >>>>>>>>
> >>>>>>>>    We request the Apache Incubator to sponsor this project.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>    ---------------------------------------------------------------------
> >>>>>>>>    To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >>>>>>>>    For additional commands, e-mail: general-help@incubator.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >
>
>
