Date: Fri, 30 Jun 2017 23:58:01 +0000 (UTC)
From: "yuhao yang (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-20082) Incremental update of LDA model, by adding initialModel as start point

    [ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070883#comment-16070883 ]

yuhao yang commented on SPARK-20082:
------------------------------------

I'm OK with only supporting initialModel for Online LDA for now. For EM LDA, an initial model is also possible, but we may need some extra checks, depending on whether EM can fit on new documents. I'll make a pass on the current implementation. But we still need the opinion and final check from [~josephkb] or other committers.

> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
>                 Key: SPARK-20082
>                 URL: https://issues.apache.org/jira/browse/SPARK-20082
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Mathieu DESPRIEE
>
> Some mllib models support an initialModel to start from and update it incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally update an existing model with batches of new documents.
> I suggest adding an initialModel as a start point for LDA.
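
The incremental update discussed above can be illustrated with a toy sketch. The Python below is not Spark's OnlineLDAOptimizer and not the API proposed in this issue. It only shows the blending step of online variational Bayes, in which the topic-word weights move toward a per-batch estimate with a decaying step size, so that starting from an existing model amounts to passing its weights as the initial value. All names (learning_rate, blend, update_incrementally) are hypothetical, the per-batch estimates are assumed to be given (the E-step that would produce them is omitted), and the tau0 and kappa defaults are illustrative.

```python
from typing import List

Matrix = List[List[float]]  # rows are topics, columns are vocabulary terms


def learning_rate(iteration: int, tau0: float = 1024.0, kappa: float = 0.51) -> float:
    """Decaying step size rho_t = (tau0 + t) ** (-kappa). Defaults are illustrative."""
    return (tau0 + iteration) ** (-kappa)


def blend(current: Matrix, batch_estimate: Matrix, rho: float) -> Matrix:
    """Move every topic-word weight a fraction rho toward the mini-batch estimate."""
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(old_row, new_row)]
        for old_row, new_row in zip(current, batch_estimate)
    ]


def update_incrementally(
    initial_topics: Matrix,
    batch_estimates: List[Matrix],
    start_iteration: int = 0,
    tau0: float = 1024.0,
    kappa: float = 0.51,
) -> Matrix:
    """Fold a sequence of per-batch estimates into an existing topic matrix.

    initial_topics plays the role of the "initialModel": the weights of a
    previously trained model (hypothetical input, not a Spark type).
    """
    topics = initial_topics
    for offset, estimate in enumerate(batch_estimates):
        rho = learning_rate(start_iteration + offset, tau0, kappa)
        topics = blend(topics, estimate, rho)
    return topics


if __name__ == "__main__":
    existing = [[1.0, 3.0]]       # one topic over a two-word vocabulary
    new_batch = [[[3.0, 1.0]]]    # a single mini-batch estimate
    print(update_incrementally(existing, new_batch, start_iteration=1, tau0=1.0, kappa=1.0))
```

In this sketch, a non-zero start_iteration keeps the step size small, so new batches refine the existing topics rather than overwrite them.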