Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 4BC5E200BA5 for ; Wed, 19 Oct 2016 10:59:52 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 4A515160AEA; Wed, 19 Oct 2016 08:59:52 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 68FBB160ADE for ; Wed, 19 Oct 2016 10:59:51 +0200 (CEST) Received: (qmail 15425 invoked by uid 500); 19 Oct 2016 08:59:50 -0000 Mailing-List: contact dev-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@flink.apache.org Delivered-To: mailing list dev@flink.apache.org Received: (qmail 15405 invoked by uid 99); 19 Oct 2016 08:59:49 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 19 Oct 2016 08:59:49 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id A14A8C05BF for ; Wed, 19 Oct 2016 08:59:48 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -1.8 X-Spam-Level: X-Spam-Status: No, score=-1.8 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RP_MATCHES_RCVD=-2.999, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=mailbox.org Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id r4zixuuXf3LA for ; Wed, 19 Oct 2016 08:59:44 +0000 (UTC) Received: from mx2.mailbox.org (mx2.mailbox.org [80.241.60.215]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 6C1FA5F24B for ; Wed, 19 Oct 2016 08:59:44 +0000 (UTC) Received: from smtp1.mailbox.org (smtp1.mailbox.org [80.241.60.240]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx2.mailbox.org (Postfix) with ESMTPS id 901F04377F for ; Wed, 19 Oct 2016 10:59:38 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mailbox.org; h= content-type:content-type:in-reply-to:mime-version:date:date :message-id:from:from:references:subject:subject:received; s= mail20150812; t=1476867568; bh=ztWaqGz8runLYjBkBjuB3fjMZCTcojQLf J3me9uednA=; b=qEBIIvwHvnt8lStQuquELpkLro2HpUFS0gOF1yQRyJYT0hKsx tt70CMVJZWau2Egiws+Jd5nOENl7TG/JZ/NbI/BqMWclYtLg98cMQd+y9Pg5enqk 71rTHs8PrfGQa729xjpRZa1xbWaUwjOnh2k26hUWSgdRLWrD2SjG2Z9cQ5FcPF70 TUbtILFBHHkkE1QHHrviazpHVJtttNJvXLiTk1JvKhCPmDosYAL5uiyjrJvBaQgb rY83QsUMMBIWZosWLopYXe+38R/q+ket8n7fQxty+XzqxmvyPz4zZxYxlC63i2zM FIwHu5z6POp3l0IXTIG0S6MWxtRKr8XqKCqmA== X-Virus-Scanned: amavisd-new at heinlein-support.de Received: from smtp1.mailbox.org ([80.241.60.240]) by gerste.heinlein-support.de (gerste.heinlein-support.de [91.198.250.173]) (amavisd-new, port 10030) with ESMTP id nZfpVUMIqX2G for ; Wed, 19 Oct 2016 10:59:28 +0200 (CEST) Subject: Re: Type erasure problem solely on cluster execution To: dev@flink.apache.org References: From: Martin Junghanns Message-ID: <3b6e6577-d819-82cf-6fe2-fd7f528533c1@mailbox.org> Date: Wed, 19 Oct 2016 10:59:27 +0200 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/alternative; boundary="------------5B8936E2463990F2AC22F849" archived-at: Wed, 19 Oct 2016 08:59:52 -0000 --------------5B8936E2463990F2AC22F849 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hi Fabian, Thank you for the quick reply and for looking into it. Sorry, I was a bit too quick with the field reference accusation. Turns out, my TypeInformation was wrong, hence the invalid reference exception. However, the type erasure problem still holds. The actual code can be found here [1]. The code runs fine using the LocalExecutionEnvironment and it also runs on the cluster when using a non-Pojo type for T (e.g. java.lang.Long). However, for Pojo types, it fails on the cluster with a type erasure related exception. Hence, I manually created the TypeInformation for the Embedding class: public static TypeInformation>getType(Class clazz) { TypeInformation type = TypeInformation.of(clazz); TypeInformation arrayType = ObjectArrayTypeInfo.getInfoFor(type); return new TupleTypeInfo<>(arrayType, arrayType); } and for the EmbeddingWithTiePoint class: public static TypeInformation>getType(Class clazz) { TypeInformation type = TypeInformation.of(clazz); TypeInformation> embeddingType = Embedding.getType(clazz); return new TupleTypeInfo<>(type, embeddingType); } Note, that this produces the same TypeInformation as the automatic type extraction does in the local, working scenario. I provided the type info to the UDF which initially creates the EmbeddingWithTiePoint instances [1]: DataSet> initialEmbeddings = vertices .filter(new ElementHasCandidate<>(traversalCode.getStep(0).getFrom())) .map(new BuildEmbeddingWithTiePoint<>(keyClass, traversalCode, vertexCount, edgeCount)) .returns(EmbeddingWithTiePoint.getType(keyClass)); However, Flink tells me that I now need to provide the same type information at all places where the output is of type EmbeddingWithTiePoint [2], [3]. If I do so, the program fails with a clast cast exception: Caused by: java.lang.ClassCastException: org.apache.flink.api.java.tuple.Tuple2 cannot be cast to org.gradoop.flink.model.impl.operators.matching.single.preserving.explorative.tuples.EmbeddingWithTiePoint at org.gradoop.flink.model.impl.operators.matching.single.preserving.explorative.functions.UpdateEdgeMappings.join(UpdateEdgeMappings.java:50) at org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashJoinIterator.callWithNextKey(NonReusingBuildFirstHashJoinIterator.java:149) at org.apache.flink.runtime.operators.JoinDriver.run(JoinDriver.java:222) at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:486) at org.apache.flink.runtime.iterative.task.AbstractIterativeTask.run(AbstractIterativeTask.java:146) at org.apache.flink.runtime.iterative.task.IterationIntermediateTask.run(IterationIntermediateTask.java:92) at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:351) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at java.lang.Thread.run(Thread.java:745) I guess, the issue is not really the missing TypeInformation, but something that is done differently when using the cluster execution and Pojo types. Maybe related to the generic array creation via reflection? Hope this helps. Best, Martin [1] https://github.com/dbs-leipzig/gradoop/blob/master/gradoop-flink/src/main/java/org/gradoop/flink/model/impl/operators/matching/single/preserving/explorative/DistributedTraverser.java#L170 [2] https://github.com/dbs-leipzig/gradoop/blob/master/gradoop-flink/src/main/java/org/gradoop/flink/model/impl/operators/matching/single/preserving/explorative/DistributedTraverser.java#L215 [3] https://github.com/dbs-leipzig/gradoop/blob/master/gradoop-flink/src/main/java/org/gradoop/flink/model/impl/operators/matching/single/preserving/explorative/DistributedTraverser.java#L234 On 19.10.2016 09:33, Fabian Hueske wrote: > Hi Martin, > > thanks for reporting the problem and providing code to reproduce it. > > Would you mind to describe the problem with the forwarding annotations in > more detail? > I would be interested in the error message and how the semantic annotation > is provided (@ForwardFields or withForwardedFields()). > > Thanks, Fabian > > 2016-10-19 8:52 GMT+02:00 Martin Junghanns : > >> Hi, >> >> I am running into a type erasure problem which only occurs when I execute >> the code using a Flink cluster (1.1.2). I created a Gist [1] which >> reproduces the problem. I also added a unit test to show that it does not >> fail in local and collection mode. >> >> Maybe it is also interesting to mention that - in my actual code - I >> manually created a TypeInformation (the same which is automatically created >> on local execution) and gave it to the operators using .returns(..). >> However, this lead to the issue, that my field forwarding annotations >> failed with invalid reference exceptions (the same annotations that work >> locally). >> >> The issue came up after I generalized the core of one our algorithms. >> Before, when the types were non-generic, this ran without problems locally >> and on the cluster. >> >> Thanks in advance! >> >> Cheers, Martin >> >> [1] https://gist.github.com/s1ck/caf9f3f46e7a5afe6f6a73c479948fec >> >> The exception in the Gist case: >> >> The return type of function 'withPojo(Problem.java:58)' could not be >> determined automatically, due to type erasure. You can give type >> information hints by using the returns(...) method on the result of the >> transformation call, or by letting your function implement the >> 'ResultTypeQueryable' interface. >> org.apache.flink.api.java.DataSet.getType(DataSet.java:178) >> org.apache.flink.api.java.DataSet.collect(DataSet.java:407) >> org.apache.flink.api.java.DataSet.print(DataSet.java:1605) >> Problem.withPojo(Problem.java:60) >> Problem.main(Problem.java:38) >> >> --------------5B8936E2463990F2AC22F849--