Subject: Re: Iterations problem in command line
From: Marcela Charfuelan
Date: Tue, 1 Mar 2016 11:59:18 +0100

Hi,

the iteration looks like:

    DataSet gmms = getInitialGMMDataSet(env);
    IterativeDataSet loop = gmms.iterate(50);
    DataSet newGMMs = features
        .map(new Estep_ExpectationMaximisation()).withBroadcastSet(loop, "gmms")
        .reduceGroup(new Mstep_ExpectationMaximisation()).withBroadcastSet(loop, "gmms");
    DataSet finalGMMs = loop.closeWith(newGMMs);

In every iteration the gmms parameters should be updated. I have noticed
that the first iteration is OK, but afterwards the results start to go
wrong, at least on the command line (for example, gmms.coeff for the
three gmms here should not sum to more than 1).

I have put the code here in case it helps:
https://github.com/marcelach1/EmExercise

Regards,
Marcela.

On 01.03.2016 10:58, Fabian Hueske wrote:
> Yes, env.setParallelism(1) fixes the parallelism of all operators to 1
> (unless an operator overrides this setting).
> Can you identify at which position in the data flow the results start to
> diverge?
>
> Best, Fabian
>
> 2016-02-29 17:57 GMT+01:00 Marcela Charfuelan:
>
> > Thanks Fabian,
> > I am using the default options in both, since I am not testing on a
> > cluster yet, just locally on Ubuntu; I am not specifying any parallelism.
> > Just to test, I set env.setParallelism(1) in the program and ran it
> > with -p 1 (which I guess I would not need), but I am still getting
> > the same issue.
> >
> > Regards,
> > Marcela.
> >
> > On 29.02.2016 16:44, Fabian Hueske wrote:
> >
> > > Hi Marcela,
> > >
> > > do you run the algorithm in both setups with the same parallelism?
> > >
> > > Best, Fabian
> > >
> > > 2016-02-26 16:52 GMT+01:00 Marcela Charfuelan:
> > >
> > > > Hello,
> > > >
> > > > I implemented an algorithm that includes iterations (EM
> > > > algorithm) and I am getting different results when running in
> > > > Eclipse (Luna Release (4.4.0)) and when running on the command
> > > > line using flink run. The program does not crash; it is just that
> > > > after the first iteration the results are different (wrong on the
> > > > command line).
> > > >
> > > > The solution I am getting in Eclipse, for each iteration, is the
> > > > same that I would get if running the algorithm in Octave, for
> > > > example, so I am sure the solution is correct.
> > > >
> > > > I have tried using java plus the command-line arguments of
> > > > Eclipse on the command line, and that also works OK (locally on
> > > > Ubuntu).
> > > >
> > > > Has anybody experienced something similar? Any idea why this
> > > > could happen? How can I fix this?
> > > >
> > > > Regards,
> > > > Marcela.

-- 
Dr. Marcela Charfuelan, Senior Researcher
TU Berlin, School of Electrical Engineering and Computer Sciences
Database Systems and Information Management (DIMA)
EN7, Einsteinufer 17, D-10587 Berlin
Room: EN 725 Phone: +49 30-314-23556
URL: http://www.user.tu-berlin.de/charfuelan