mahout-dev mailing list archives

From Andrew Palumbo <ap....@outlook.com>
Subject Re: Release
Date Wed, 18 Mar 2015 16:48:17 GMT
https://issues.apache.org/jira/browse/MAHOUT-1648?filter=-4&jql=project%20%3D%20MAHOUT%20AND%20status%20in%20%28Open%2C%20Reopened%29%20AND%20priority%20in%20%28Blocker%2C%20Critical%29%20ORDER%20BY%20createdDate%20DESC

On 03/18/2015 12:07 PM, Pat Ferrel wrote:
> Only 6 issues, mostly recent tickets, including the things on the list.
>
> No old bugs, so once people see that we'll have a common measure of doneness.
>
> • MAHOUT-1648: Update Mahout's CMS for 0.10.0
> • MAHOUT-1647: The release build is incomplete
> • MAHOUT-1638: H2O bindings fail at drmParallelizeWithRowLabels(...)
> • MAHOUT-1586: Downloads must have hashes
> • MAHOUT-1522: Handle logging levels via log4j.xml
> • MAHOUT-1512: Hadoop 2 compatibility
>
> On Mar 18, 2015, at 9:03 AM, Andrew Palumbo <ap.dev@outlook.com> wrote:
>
> Yeah, makes sense. I don't think there are any Blocker legacy issues at the moment.
>
> On 03/18/2015 11:56 AM, Andrew Musselman wrote:
>> Yep
>>
>> On Wednesday, March 18, 2015, Andrew Palumbo <ap.dev@outlook.com> wrote:
>>
>>> Andrew - by the first block and second, do you mean 1,2,3 for 0.10 and 3,4
>>> for 0.10.1?
>>>
>>> On 03/17/2015 08:26 PM, Shannon Quinn wrote:
>>>
>>>> +1
>>>>
>>>> On 3/17/15 8:19 PM, Andrew Musselman wrote:
>>>>
>>>>> How about 0.10 is the first block and 0.10.1 is the second?
>>>>>
>>>>> On Wed, Mar 18, 2015 at 1:12 AM, Andrew Palumbo <ap.dev@outlook.com>
>>>>> wrote:
>>>>>
>>>>>> I like this timeline... though mid-April is coming up quickly. Going
>>>>>> back to Pat's list for 0.10.0:
>>>>>>
>>>>>> 1) refactor mrlegacy out of scala deps.
>>>>>> 2) build fixes for release.
>>>>>> 3) docs -- might be good to guinea-pig the new CMS with git pubsub so we
>>>>>> don't have to do svn, not sure when that will be ready
>>>>>>
>>>>>> I would add:
>>>>>> 4) Fix any remaining legacy bugs.
>>>>>> 5) docs, docs, docs
>>>>>>
>>>>>> along with just some general cleanup. Is anything else missing?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 03/17/2015 07:16 PM, Andrew Musselman wrote:
>>>>>>
>>>>>>   I'm good with that timing, pending scope.
>>>>>>> On Wed, Mar 18, 2015 at 12:13 AM, Dmitriy Lyubimov <dlieu.7@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> i was thinking 0.10.0 mid-april, update 0.10.1 end of spring.
>>>>>>>>
>>>>>>>> i would suggest feature extraction topics for 0.11.x. Esp. w.r.t.
>>>>>>>> SchemaRDD aka DataFrame -- vectorizing, hashing, ML schema support,
>>>>>>>> imputation of missing data, outlier cleanups etc. There's a lot.
>>>>>>>>
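
As a concrete shape for the "vectorizing, hashing" item above, a minimal hashing-trick sketch in plain Scala over Mahout's in-core RandomAccessSparseVector. The helper name and the 2^14 cardinality are illustrative choices, not a proposed API:

    // Minimal hashing-trick feature encoder over mahout-math's sparse vector.
    // Assumes only mahout-math on the classpath; hashEncode and the chosen
    // cardinality are illustrative, not an agreed interface.
    import org.apache.mahout.math.{RandomAccessSparseVector, Vector}

    object HashingSketch {
      def hashEncode(features: Seq[String], cardinality: Int = 1 << 14): Vector = {
        val v = new RandomAccessSparseVector(cardinality)
        features.foreach { f =>
          // map the feature to a non-negative bucket and bump its count
          val idx = ((f.hashCode % cardinality) + cardinality) % cardinality
          v.setQuick(idx, v.getQuick(idx) + 1.0)
        }
        v
      }

      def main(args: Array[String]): Unit =
        println(hashEncode(Seq("user=123", "device=mobile", "geo=US")))
    }
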
>>>>>>>> Hardware backs integration -- i will certainly be looking at those,
>>>>>>>> but perhaps the easiest is to start with automatic detection and
>>>>>>>> configuration of capabilities via netlib, since it is already in the
>>>>>>>> path and it seems likely that it will (eventually) support cuda as
>>>>>>>> well in some form. This is for 0.11 or 0.12.x, depends on availability.
>>>>>>>>
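
On the netlib point, "detection of capabilities" today largely amounts to checking which BLAS/LAPACK implementation netlib-java actually bound at runtime. A minimal sketch, assuming the com.github.fommil.netlib artifacts are on the classpath (the object name and the reporting logic are illustrative):

    // Probe which netlib-java backend loaded: F2jBLAS is the pure-Java
    // fallback, NativeSystemBLAS/NativeRefBLAS mean a native library
    // (OpenBLAS, ATLAS, MKL, ...) was picked up.
    import com.github.fommil.netlib.{BLAS, LAPACK}

    object NetlibProbe {
      def main(args: Array[String]): Unit = {
        val blasImpl   = BLAS.getInstance().getClass.getName
        val lapackImpl = LAPACK.getInstance().getClass.getName
        println(s"BLAS:   $blasImpl")
        println(s"LAPACK: $lapackImpl")
        val native = !blasImpl.endsWith("F2jBLAS")
        println(s"Native acceleration: $native")
      }
    }
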
>>>>>>>> Higher order methods are somewhat a matter of inspiration. I think i
>>>>>>>> could offer some stuff there too, as I already have implemented a lot
>>>>>>>> of those on top of Mahout before. I did bayesian optimization (aka
>>>>>>>> "spearmint", GP-EI etc.) on Mahout algebra, line search, (L)BFGS,
>>>>>>>> stats including Gaussian Process support. BFGS and line search are
>>>>>>>> fairly simple methods and i will give a reference if anybody is
>>>>>>>> interested. Also, breeze has line search with strong Wolfe
>>>>>>>> conditions (if a coded reference is needed). All that is up for grabs
>>>>>>>> as a fairly well understood subject.
>>>>>>>>
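
For the Breeze "coded reference" mentioned above, a minimal sketch of breeze.optimize.LBFGS on a toy quadratic; Breeze's LBFGS drives its step sizes with the strong-Wolfe line search noted in the email. The objective, sizes, and parameters below are illustrative only:

    // Minimize f(x) = ||x - t||^2 with Breeze's LBFGS; gradient is 2(x - t).
    import breeze.linalg.DenseVector
    import breeze.optimize.{DiffFunction, LBFGS}

    object LbfgsSketch {
      def main(args: Array[String]): Unit = {
        val target = DenseVector.fill(10)(3.0)
        val f = new DiffFunction[DenseVector[Double]] {
          def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
            val d = x - target
            (d dot d, d * 2.0)
          }
        }
        val lbfgs = new LBFGS[DenseVector[Double]](maxIter = 100, m = 5)
        val xStar = lbfgs.minimize(f, DenseVector.zeros[Double](10))
        println(s"argmin ~ $xStar") // expect every coordinate near 3.0
      }
    }
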
>>>>>>>> (5-6 months out) Once GP-EI is available, it becomes a fairly
>>>>>>>> interesting topic to resurrect the implicit feedback issue. An
>>>>>>>> important insight there is that feature encoding can in fact be done
>>>>>>>> by a custom scheme (not necessarily the encoding scheme used in the
>>>>>>>> paper; in fact, there are 2 of them there; or the way mllib encodes
>>>>>>>> that as well). Once custom encoding schemes are adjusted, using
>>>>>>>> bayesian optimization is increasingly important, especially if there
>>>>>>>> are more than just 2 parameters there.
>>>>>>>>
>

