cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrus Adamchik <and...@objectstyle.org>
Subject 4.2 - DISTINCT in column query
Date Sat, 18 Apr 2020 08:23:28 GMT
TL;DR: Cayenne using DISTINCT for column queries in 4.2 affects Agrest algorithms, but is nevertheless
correct and should stay around.

----
When testing M1, I noticed a difference in ordering of the second-level results in Agrest
vs 4.0/4.1. Those are generated in Agrest via a SelectQuery with columns roughly looking like
this:

  List<Property<?>> properties = new ArrayList<>();
  properties.add(Property.createSelf(E3.class));

  Expression exp = ExpressionFactory.dbPathExp("e2.id");
  properties.add(Property.create(exp, Integer.class));

  SelectQuery query = new SelectQuery(E3.class);
  query.setColumns(properties);

Here "e2" is a to-one relationship. The difference in generated SQL between 4.0 and 4.2 is
the DISTINCT keyword added in the latter:

4.0: 
  SELECT t0.name, t0.e2_id, t0.id_, t1.id_ FROM utest.e3 t0 JOIN utest.e2 t1 ON (t0.e2_id
= t1.id_)

4.2:
  SELECT DISTINCT t0.name, t0.e2_id, t0.id_, t1.id_ FROM utest.e3 t0 JOIN utest.e2 t1 ON t0.e2_id
= t1.id_ 

Both produce the correct result, but since there's no explicit ordering, the actual order
of the objects returned by Derby is different. I am less worried about the ordering (this
was just an indicator to me that something has changed), but DISTINCT has a performance impact,
and now it seems it will affect the main execution path of Agrest. 

I suppose Cayenne behavior in 4.2 is correct as with column queries there are no simple rules
for when there may be duplicate result rows. Our case doesn't require DISTINCT only because
of the special combination (entity and a related to-one id). And we need to fix this on Agrest
end (that join is redundant in case of to-one). 

Still figured I'd mention...

Andrus



Mime
View raw message