flink-issues mailing list archives

From "Sergey Nuyanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8255) Key expressions on named row types do not work
Date Sun, 29 Apr 2018 20:41:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458179#comment-16458179 ]

Sergey Nuyanzin commented on FLINK-8255:
----------------------------------------

A little research shows that the problem is related to the class hierarchy of RowTypeInfo,
TupleTypeInfoBase and TupleTypeInfo: both RowTypeInfo and TupleTypeInfo are subclasses of TupleTypeInfoBase.
At the same time, some classes, e.g. org.apache.flink.streaming.util.typeutils.FieldAccessorFactory,
contain checks followed by a cast:
{code:java}
...
 else if (typeInfo.isTupleType()) {
			TupleTypeInfoBase tupleTypeInfo = (TupleTypeInfoBase) typeInfo;
...{code}
Because RowTypeInfo and TupleTypeInfo sit on parallel branches of the hierarchy, a cast to
TupleTypeInfo fails for RowTypeInfo. At the same time, nothing in that code appears to be specific
to TupleTypeInfo, so casting to TupleTypeInfoBase should be enough.
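The effect can be reproduced outside Flink with a minimal sketch. The classes below are hypothetical stand-ins (TypeInfo, TupleBase, TupleInfo, RowInfo) that only mimic the inheritance shape described above; they are not the real Flink types, and the method names are made up for illustration.

```java
// Hypothetical stand-ins for the Flink type-info classes; only the hierarchy matters.
public class CastDemo {
    static class TypeInfo {
        boolean isTupleType() { return false; }
    }

    // Plays the role of TupleTypeInfoBase: common parent of both branches.
    static class TupleBase extends TypeInfo {
        @Override
        boolean isTupleType() { return true; }
        int getArity() { return 2; }
    }

    // Play the roles of TupleTypeInfo and RowTypeInfo: siblings under TupleBase.
    static class TupleInfo extends TupleBase { }
    static class RowInfo extends TupleBase { }

    // Problematic pattern: the isTupleType() check passes for RowInfo,
    // but the cast targets the sibling class and therefore fails.
    static int arityViaSiblingCast(TypeInfo typeInfo) {
        if (typeInfo.isTupleType()) {
            return ((TupleInfo) typeInfo).getArity();
        }
        throw new IllegalArgumentException("not a tuple type");
    }

    // Possible fix: cast to the common parent, which covers both branches.
    static int arityViaBaseCast(TypeInfo typeInfo) {
        if (typeInfo.isTupleType()) {
            return ((TupleBase) typeInfo).getArity();
        }
        throw new IllegalArgumentException("not a tuple type");
    }

    public static void main(String[] args) {
        TypeInfo row = new RowInfo();
        System.out.println("base cast arity: " + arityViaBaseCast(row));
        try {
            arityViaSiblingCast(row);
        } catch (ClassCastException e) {
            System.out.println("sibling cast failed with ClassCastException");
        }
    }
}
```

The base cast only works here because the sketch needs nothing beyond what the parent class offers; whether the same holds for every cast site in Flink would have to be checked case by case.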
Looking at the usages of the FieldAccessorFactory methods that contain this cast, two more
test cases could be added that also fail with a similar ClassCastException:
{code:java}
		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

		TypeInformation<?>[] types = new TypeInformation<?>[]{Types.INT, Types.INT};

		String[] fieldNames = new String[]{"id", "value"};
		RowTypeInfo rowTypeInfo = new RowTypeInfo(types, fieldNames);

		UnsortedGrouping<Row> groupDs = env.fromCollection(Collections.singleton(new Row(2)), rowTypeInfo).groupBy(0);

		// fails with a ClassCastException
		groupDs.maxBy(1);
{code}
and a second one that is almost identical but calls .minBy in the last line.

One possible fix is to cast to TupleTypeInfoBase rather than to TupleTypeInfo
(I'm not sure that changing the hierarchy is an option).
Such a fix for the three cases mentioned is available here: https://github.com/apache/flink/compare/master...snuyanzin:FLINK-8255_Key_expressions_on_named_row_types_do_not_work

At the same time, it looks like there could still be other issues. For example,
org.apache.flink.api.java.DataSet#minBy
org.apache.flink.api.java.DataSet#maxBy
org.apache.flink.streaming.util.typeutils.FieldAccessor.RecursiveTupleFieldAccessor#RecursiveTupleFieldAccessor

also contain such a cast; however, at the moment I do not have an idea for a test in which
they would fail.

> Key expressions on named row types do not work
> ----------------------------------------------
>
>                 Key: FLINK-8255
>                 URL: https://issues.apache.org/jira/browse/FLINK-8255
>             Project: Flink
>          Issue Type: Bug
>          Components: DataSet API, DataStream API
>    Affects Versions: 1.4.0, 1.5.0
>            Reporter: Timo Walther
>            Priority: Major
>
> The following program fails with a {{ClassCastException}}. It seems that key expressions and rows are not tested well. We should add more tests for them.
> {code}
> final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> TypeInformation[] types = new TypeInformation[] {Types.INT, Types.INT};
> String[] fieldNames = new String[]{"id", "value"};
> RowTypeInfo rowTypeInfo = new RowTypeInfo(types, fieldNames);
> env.fromCollection(Collections.singleton(new Row(2)), rowTypeInfo)
> .keyBy("id").sum("value").print();
> env.execute("Streaming WordCount");
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
