spark-dev mailing list archives

From Stephen Boesch <java...@gmail.com>
Subject Re: Spark.ml roadmap 2.3.0 and beyond
Date Tue, 20 Mar 2018 22:33:14 GMT
Awesome, thanks Joseph.

2018-03-20 14:51 GMT-07:00 Joseph Bradley <joseph@databricks.com>:

> The promised roadmap JIRA:
> https://issues.apache.org/jira/browse/SPARK-23758
>
> Note it doesn't have much explicitly listed yet, but committers can add
> items as they agree to shepherd them.  (Committers, make sure to check what
> you're currently listed as shepherding!)  The links for searching can be
> useful too.
>
> On Thu, Dec 7, 2017 at 3:55 PM, Stephen Boesch <javadba@gmail.com> wrote:
>
>> Thanks Joseph.  We can wait for post 2.3.0.
>>
>> 2017-12-07 15:36 GMT-08:00 Joseph Bradley <joseph@databricks.com>:
>>
>>> Hi Stephen,
>>>
>>> I used to post those roadmap JIRAs to share instructions for
>>> contributing to MLlib and to try to coordinate amongst committers.  My
>>> feeling was that the coordination aspect was of mixed success, so I did not
>>> post one for 2.3.  I'm glad you pinged about this; if those were useful,
>>> then I can plan on posting one for the release after 2.3.  As far as
>>> identifying committers' plans, the best option right now is to look for
>>> Shepherds in JIRA as well as the few mailing list threads about directions.
>>>
>>> For myself, I'm mainly focusing on fixing some issues with persistence
>>> for custom algorithms in PySpark (done), adding the image schema (done),
>>> and using ML Pipelines in Structured Streaming (WIP).
>>>
>>> Joseph
>>>
>>> On Wed, Nov 29, 2017 at 6:52 AM, Stephen Boesch <javadba@gmail.com>
>>> wrote:
>>>
>>>> There are several JIRAs and/or PRs that contain logic the Data
>>>> Science teams that I work with use in their local models. We are trying to
>>>> determine if/when these features may gain traction again.  In at least one
>>>> case all of the work was done but the shepherd said that getting it
>>>> committed was of lower priority than other tasks - one specifically
>>>> mentioned was the mllib/ml parity that has been ongoing for nearly three
>>>> years.
>>>>
>>>> In order to prioritize the work that the ML platform would do, it would be
>>>> helpful to know at least which, if any, of those tasks are going to be moved
>>>> ahead by the community, since we could then focus on other ones instead of
>>>> duplicating the effort.
>>>>
>>>> In addition there are some engineering code jam sessions that happen
>>>> periodically: knowing which features are actively on the roadmap would
>>>> *certainly* influence our selection of work.  The roadmaps from 2.2.0 and
>>>> earlier were a very good starting point to understand not just the specific
>>>> work in progress - but also the current mindset/thinking of the committers
>>>> in terms
>>>> of general priorities.
>>>>
>>>> So if the same format of document is not available - then what
>>>> content *is* available that gives a picture of where spark.ml is headed?
>>>>
>>>> 2017-11-29 6:39 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>>>>
>>>>> Any further information/ thoughts?
>>>>>
>>>>>
>>>>>
>>>>> 2017-11-22 15:07 GMT-08:00 Stephen Boesch <javadba@gmail.com>:
>>>>>
>>>>>> The roadmaps for prior releases, e.g. 1.6, 2.0, 2.1, 2.2, were available:
>>>>>>
>>>>>> 2.2.0 https://issues.apache.org/jira/browse/SPARK-18813
>>>>>>
>>>>>> 2.1.0 https://issues.apache.org/jira/browse/SPARK-15581
>>>>>> ..
>>>>>>
>>>>>> It seems those roadmaps are not available per se for 2.3.0 and
>>>>>> later? Is there a different mechanism for that info?
>>>>>>
>>>>>> stephenb
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Joseph Bradley
>>>
>>> Software Engineer - Machine Learning
>>>
>>> Databricks, Inc.
>>>
>>> [image: http://databricks.com] <http://databricks.com/>
>>>
>>
>>
>
