spark-issues mailing list archives

From "yuhao yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point
Date Fri, 30 Jun 2017 23:58:01 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070883#comment-16070883 ]

yuhao yang commented on SPARK-20082:
------------------------------------

I'm OK with supporting initialModel only for Online LDA for now. For EM LDA, an initial model
is also possible, but we may need some extra checks, depending on whether EM can fit on new documents.

I'll make a pass on the current implementation. But we still need the opinion and final check
from [~josephkb] or other committers.

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu DESPRIEE
>
> Some mllib models support an initialModel to start from and update it incrementally with
> new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally update
> an existing model with batches of new documents.
> I suggest adding an initialModel as a start point for LDA.
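
To illustrate why an online optimizer lends itself to this, here is a rough plain-Python sketch of the blending step used in online variational Bayes for LDA: each mini-batch estimate is mixed into the current topic-word matrix with a decaying weight, so the starting matrix could just as well come from an existing model as from random initialization. This is not Spark's code. The function names, the list-of-lists matrix layout, and the default values are assumptions made for the example, and the per-batch estimates are taken as given rather than computed from documents.

```python
# Illustrative sketch only; not the OnlineLDAOptimizer implementation.
# All names below are hypothetical and exist only for this example.

def step_size(t, tau0=1024.0, kappa=0.51):
    """Weight given to mini-batch number t; it shrinks as t grows."""
    return (tau0 + t) ** (-kappa)


def blend(lam, lam_hat, rho):
    """Move the topic-word matrix `lam` toward the batch estimate `lam_hat`."""
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(lam, lam_hat)
    ]


def incremental_update(initial_lam, batch_estimates, start_iter=0,
                       tau0=1024.0, kappa=0.51):
    """Start from an existing model's matrix and fold in new batch estimates.

    `initial_lam` plays the role of the proposed initialModel;
    `batch_estimates` is a sequence of matrices of the same shape, one per
    mini-batch of new documents (assumed to be computed elsewhere).
    """
    lam = [row[:] for row in initial_lam]  # copy, leave the input untouched
    for t, lam_hat in enumerate(batch_estimates, start=start_iter + 1):
        lam = blend(lam, lam_hat, step_size(t, tau0, kappa))
    return lam
```

Passing a non-zero `start_iter` would keep the weights small for a model that has already seen many batches, so new documents adjust the topics rather than overwrite them. Whether the real feature should resume the iteration counter that way is a design question for the implementation, not something this sketch settles.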



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

