flink-dev mailing list archives

From Fabian Hueske <fhue...@apache.org>
Subject Re: Some ideas for long-term Flink-related research and implementation projects
Date Fri, 20 Jun 2014 20:02:52 GMT
I'm still +1 for a wiki.


2014-06-20 21:49 GMT+02:00 Henry Saputra <henry.saputra@gmail.com>:

> The last email thread never settled whether we want a wiki or not. It seems
> like a good idea to have a wiki, at least for now, to share ideas like this.
>
> - Henry
>
> On Friday, June 20, 2014, Robert Metzger <rmetzger@apache.org> wrote:
>
> > Thank you for writing down the ideas.
> >
> > I think we should not open JIRAs for these ideas. I would rather prefer to
> > put the list on the website or a wiki (once we have that).
> >
> >
> > On Fri, Jun 20, 2014 at 6:25 PM, Kostas Tzoumas <kostas.tzoumas@tu-berlin.de> wrote:
> >
> > > Hi Folks,
> > >
> > > After talking with Stephan, Fabian, Robert, and Ufuk, we gathered a few
> > > project ideas that people have been throwing around. These do not
> > > immediately classify as issues as they are major extensions of Flink (some
> > > might classify as completely different projects). These would make nice
> > > standalone implementation projects, for example for University theses. Some
> > > of them also require research and architecture work.
> > >
> > > The relevance to this mailing list is that perhaps someone is interested in
> > > picking up such a project.
> > >
> > > Here is the idea dump:
> > >
> > > ---------------
> > >
> > > Domain-specific language for graph processing: Create a GraphDataSet that
> > > abstracts away the internal representation of a graph and operations on the
> > > GraphDataSet. The project involves gathering requirements for graph
> > > processing functionality, architecting the DSL, implementation, and
> > > possible work on optimizing the operations when a graph operation can be
> > > mapped to different DataSet to DataSet transformations.
> > >
> > > Distributed mutable state: Currently delta iterations use internally a hash
> > > index to store the state of the iteration, and they invoke index merging
> > > functionality. One idea would be to surface an operator (with care) to the
> > > APIs that essentially allows mutable state manipulations. Another idea
> > > would be to implement something along the lines of a parameter server and
> > > make such functionality accessible to the APIs.
> > >
> > > Domain-specific language for spatial data: Create spatial data types
> > > (point, region, etc) and operations thereof
> > >
> > > Integration into Apache BigTop
> > >
> > > Integration with Apache Ambari
> > >
> > > Pig frontend for Flink: An initial effort was here:
> > > http://kth.diva-portal.org/smash/get/diva2:539046/FULLTEXT01.pdf
> > >
> > > Cascading on Flink
> > >
> > > Optimizing the integration with columnar file formats (Parquet, ORCFile)
> > > and perhaps eventually pushing filters down to data scans.
> > >
> > > Statistical operators to extract statistical information from a DataSet
> > > (e.g., histograms of value distributions)
> > >
> > > Integration with Apache Mahout (ongoing effort)
> > >
> > > Integration with Apache Tez (ongoing effort)
> > >
> > > Flink Streaming (ongoing effort)
> > >
> > > Eclipse plugin that includes functionality for execution plan debugging
> > >
> > > Local execution of programs using Java Collections
> > >
> > > ---------------
> > >
> > > Feel free to extend the descriptions that are empty and to extend this
> > > list.
> > >
> > > Do you think that these would qualify as JIRA tickets classified as
> > > "wishes"?
> > >
> > > Kostas
> > >
> >
>
