From: Lydia Ickler
To: user@flink.apache.org
Subject: Re: normalizing DataSet with cross()
Date: Tue, 22 Mar 2016 15:15:24 +0100

Hi Till,

maybe it is doing so because I overwrite the ds again in the next step and the processing steps then get mixed up?
I am reading the data from a local .csv file with readMatrix(env, "filename").

See code below.

Best regards,
Lydia

//read input file
DataSet<Tuple3<Integer, Integer, Double>> ds = readMatrix(env, input);

/****************
 POWER ITERATION
*****************/

//get initial vector - which equals matrixA * [1, ... , 1]
DataSet<Tuple3<Integer, Integer, Double>> initial = ds(0).aggregate(Aggregations.SUM, 2);

//normalize by maximum value
initial = initial.cross(initial.aggregate(Aggregations.MAX, 2)).map(new normalizeByMax());

public static DataSource<Tuple3<Integer, Integer, Double>> readMatrix(ExecutionEnvironment env,
        String filePath) {
    CsvReader csvReader = env.readCsvFile(filePath);
    csvReader.fieldDelimiter(",");
    csvReader.includeFields("ttt");
    return csvReader.types(Integer.class, Integer.class, Double.class);
}
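
The comment in the snippet above describes the initial vector as matrixA * [1, ... , 1], which amounts to summing each row of the matrix. As a rough illustration only, the following plain-Java sketch (no Flink involved) computes those row sums from (row, column, value) entries. The names `PowerIterationInitSketch`, `Entry` and `rowSums` are hypothetical, and treating field 0 as the row index is an assumption, not something the message states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of the "initial vector" step; hypothetical, not Flink code. */
final class PowerIterationInitSketch {

    /** One sparse matrix entry, mirroring Tuple3<Integer, Integer, Double>. */
    static final class Entry {
        final int row;      // assumed to correspond to field 0
        final int col;      // assumed to correspond to field 1
        final double value; // field 2

        Entry(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    /** Returns matrixA * [1, ..., 1], i.e. the sum of each row, keyed by row index. */
    static Map<Integer, Double> rowSums(List<Entry> entries) {
        Map<Integer, Double> sums = new TreeMap<>();
        for (Entry e : entries) {
            sums.merge(e.row, e.value, Double::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Entry> matrix = Arrays.asList(
                new Entry(0, 0, 1.0), new Entry(0, 1, 3.0),
                new Entry(1, 0, 2.0), new Entry(1, 1, 0.0));
        System.out.println(rowSums(matrix)); // prints {0=4.0, 1=2.0}
    }
}
```

The sketch keeps one value per row, so it sidesteps the question of what the non-aggregated fields should contain after the sum.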

On 22.03.2016 at 14:47, Till Rohrmann <trohrmann@apache.org> wrote:

Hi Lydia,

I tried to reproduce your problem but I couldn't. Could it be that you have a non-deterministic operation somewhere in your program, or that you read the data from a source whose contents vary? Maybe you could send us a complete, compilable program which reproduces your problem.

Cheers,
Till
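
One way to narrow down a "result changes every time" report is to check whether two runs differ in their values or only in the order of the records, since the output of a parallel job is generally not guaranteed to arrive in a fixed order. The plain-Java sketch below is only an illustration of that check. The class `RunComparisonSketch` and the method `sameRecords` are hypothetical names, and it assumes each run's output has been collected as one text line per record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for comparing the outputs of two program runs. */
final class RunComparisonSketch {

    /** Returns true if both runs contain the same records, ignoring their order. */
    static boolean sameRecords(List<String> runA, List<String> runB) {
        List<String> a = new ArrayList<>(runA);
        List<String> b = new ArrayList<>(runB);
        Collections.sort(a); // sort copies so the callers' lists stay untouched
        Collections.sort(b);
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("0,0,1.0", "1,0,0.5");
        List<String> reordered = Arrays.asList("1,0,0.5", "0,0,1.0");
        List<String> changed = Arrays.asList("0,1,1.0", "1,0,0.5");
        System.out.println(sameRecords(first, reordered)); // prints true
        System.out.println(sameRecords(first, changed));   // prints false
    }
}
```

If the comparison returns true, only the ordering varies between runs. If it returns false, some field values themselves differ, which points at the computation rather than at the output order.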

On Tue, Mar 22, 2016 at 2:21 PM, Lydia Ickler <icklerly@googlemail.com> wrote:
Hi all,

I have a question.
If I have a DataSet<Tuple3<Integer, Integer, Double>> ds and I want to normalize all values (at position 2) in it by the maximum of the DataSet (ds.aggregate(Aggregations.MAX, 2)), how do I tackle that?

If I use the cross operator, my result changes every time I run the program (see code below).
Any suggestions?
Thanks in advance!
Lydia
ds.cross(ds.aggregate(Aggregations.MAX, 2)).map(new normalizeByMax());
public static final class normalizeByMax implements
        MapFunction<Tuple2<Tuple3<Integer, Integer, Double>, Tuple3<Integer, Integer, Double>>,
                Tuple3<Integer, Integer, Double>> {

    // value.f0 is an element of the DataSet, value.f1 is the tuple holding the maximum
    @Override
    public Tuple3<Integer, Integer, Double> map(
            Tuple2<Tuple3<Integer, Integer, Double>, Tuple3<Integer, Integer, Double>> value)
            throws Exception {
        return new Tuple3<Integer, Integer, Double>(value.f0.f0, value.f0.f1, value.f0.f2 / value.f1.f2);
    }
}
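
For reference, the intended result of the cross-and-map combination above can be modelled without Flink: every element is paired with the single maximum, and only its third field is divided. The following is a minimal plain-Java sketch of that intent under stated assumptions. `CrossNormalizeSketch`, `Triple` and `maxValue` are hypothetical names, and the sketch says nothing about how Flink executes the cross operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of "divide field 2 by the global maximum"; not Flink code. */
final class CrossNormalizeSketch {

    /** Stand-in for Tuple3<Integer, Integer, Double>. */
    static final class Triple {
        final int f0;
        final int f1;
        final double f2;

        Triple(int f0, int f1, double f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }
    }

    /** Largest f2 in the list, standing in for the MAX aggregation on field 2. */
    static double maxValue(List<Triple> ds) {
        double max = Double.NEGATIVE_INFINITY;
        for (Triple t : ds) {
            max = Math.max(max, t.f2);
        }
        return max;
    }

    /** Pairs every element with the single maximum and divides f2 by it. */
    static List<Triple> normalizeByMax(List<Triple> ds) {
        double max = maxValue(ds);
        List<Triple> out = new ArrayList<>(ds.size());
        for (Triple t : ds) {
            out.add(new Triple(t.f0, t.f1, t.f2 / max)); // f0 and f1 pass through unchanged
        }
        return out;
    }

    public static void main(String[] args) {
        List<Triple> ds = Arrays.asList(
                new Triple(0, 0, 2.0), new Triple(0, 1, 4.0), new Triple(1, 0, 1.0));
        for (Triple t : normalizeByMax(ds)) {
            System.out.println(t.f0 + "," + t.f1 + "," + t.f2);
        }
        // prints 0,0,0.5 then 0,1,1.0 then 1,0,0.25
    }
}
```

As in the original map function, only the third field of the maximum is used. A maximum of zero would yield NaN or infinite values, which the sketch does not guard against.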



