flink-issues mailing list archives

From "Ufuk Celebi (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (FLINK-2542) It should be documented that it is required from a join key to override hashCode(), when it is not a POJO
Date Tue, 25 Aug 2015 14:56:45 GMT

     [ https://issues.apache.org/jira/browse/FLINK-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ufuk Celebi resolved FLINK-2542.
--------------------------------
    Resolution: Won't Fix

I think it's OK to assume that people follow the general Object contract:
{code}
Note that it is generally necessary to override the {@code hashCode} method whenever this
method is overridden, so as to maintain the general contract for the {@code hashCode} method,
which states that equal objects must have equal hash codes.
{code}

If more people run into this, we can revisit this issue.
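
As an illustration of the contract quoted above, here is a minimal sketch of a key type whose hashCode() is derived from the same field that equals() compares. The names VertexId and VertexIdDemo are hypothetical and not part of Flink; Long.hashCode(long) assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type for illustration only; not part of Flink.
final class VertexId implements Comparable<VertexId> {
    private final long value;

    VertexId(long value) {
        this.value = value;
    }

    @Override
    public int compareTo(VertexId other) {
        return Long.compare(value, other.value);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof VertexId)) {
            return false;
        }
        return value == ((VertexId) obj).value;
    }

    // Derived from the same field that equals() compares, so equal keys
    // always produce equal hash codes.
    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }
}

class VertexIdDemo {
    public static void main(String[] args) {
        Map<VertexId, Long> degrees = new HashMap<>();
        degrees.put(new VertexId(123L), 4L);
        // A distinct but equal instance finds the entry.
        System.out.println(degrees.get(new VertexId(123L))); // prints 4
    }
}
```

Unlike a constant hash code such as 42, which also satisfies the contract, a field-based hash spreads distinct keys across buckets.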

> It should be documented that it is required from a join key to override hashCode(), when it is not a POJO
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-2542
>                 URL: https://issues.apache.org/jira/browse/FLINK-2542
>             Project: Flink
>          Issue Type: Bug
>          Components: Gelly, Java API
>            Reporter: Gabor Gevay
>            Priority: Minor
>             Fix For: 0.10, 0.9.1
>
>
> If the join key is not a POJO, and does not override hashCode, then the join silently fails (produces empty output). I don't see this documented anywhere.
> The Gelly documentation should also have this info separately, because it does joins internally on the vertex IDs, but the user might not know this, or might not look at the join documentation when using Gelly.
> Here is example code:
> {noformat}
> public static class ID implements Comparable<ID> {
> 	public long foo;
> 	//no default ctor --> not a POJO
> 	public ID(long foo) {
> 		this.foo = foo;
> 	}
> 	@Override
> 	public int compareTo(ID o) {
> 		return ((Long)foo).compareTo(o.foo);
> 	}
> 	@Override
> 	public boolean equals(Object o0) {
> 		if(o0 instanceof ID) {
> 			ID o = (ID)o0;
> 			return foo == o.foo;
> 		} else {
> 			return false;
> 		}
> 	}
> 	@Override
> 	public int hashCode() {
> 		return 42;
> 	}
> }
> public static void main(String[] args) throws Exception {
> 	ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> 	DataSet<Tuple2<ID, Long>> inDegrees = env.fromElements(Tuple2.of(new ID(123L), 4L));
> 	DataSet<Tuple2<ID, Long>> outDegrees = env.fromElements(Tuple2.of(new ID(123L), 5L));
> 	DataSet<Tuple3<ID, Long, Long>> degrees = inDegrees.join(outDegrees, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0).equalTo(0)
> 			.with(new FlatJoinFunction<Tuple2<ID, Long>, Tuple2<ID, Long>, Tuple3<ID, Long, Long>>() {
> 				@Override
> 				public void join(Tuple2<ID, Long> first, Tuple2<ID, Long> second, Collector<Tuple3<ID, Long, Long>> out) {
> 					out.collect(new Tuple3<ID, Long, Long>(first.f0, first.f1, second.f1));
> 				}
> 			}).withForwardedFieldsFirst("f0;f1").withForwardedFieldsSecond("f1");
> 	System.out.println("degrees count: " + degrees.count());
> }
> {noformat}
> This prints 1, but if I comment out the hashCode() override, it prints 0.
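
The empty output described in the report can be reproduced outside Flink with a toy hash join. The sketch below is a simplified model under the assumption that a hash-based join looks up matching records by the key's hashCode(); it is not Flink's actual implementation, and the names IdentityHashedKey and ToyHashJoin are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key: overrides equals() but inherits Object.hashCode(),
// which breaks the equals/hashCode contract.
final class IdentityHashedKey {
    final long value;

    IdentityHashedKey(long value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof IdentityHashedKey && ((IdentityHashedKey) o).value == value;
    }
    // hashCode() is deliberately not overridden.
}

final class ToyHashJoin {
    // Builds a hash table over the left keys and probes it with the right keys.
    // Returns the number of matching (left, right) pairs.
    static <K> int countMatches(List<K> left, List<K> right) {
        Map<K, Integer> build = new HashMap<>();
        for (K key : left) {
            build.merge(key, 1, Integer::sum);
        }
        int matches = 0;
        for (K key : right) {
            matches += build.getOrDefault(key, 0);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Long implements hashCode() consistently with equals(): one match.
        System.out.println(countMatches(Arrays.asList(123L), Arrays.asList(123L)));

        // Equal keys with identity-based hash codes almost always land in
        // different buckets, so the probe typically finds nothing.
        System.out.println(countMatches(
                Arrays.asList(new IdentityHashedKey(123L)),
                Arrays.asList(new IdentityHashedKey(123L))));
    }
}
```

The second result is not strictly guaranteed, because two distinct objects may happen to share an identity hash code. This is why the failure is silent rather than an error.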



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
