cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrus Adamchik <and...@objectstyle.org>
Subject Re: Problem with catastrophic performance degradation under PostgreSQL
Date Tue, 21 Apr 2015 09:55:56 GMT
Done. The fix is 4.0-only [1] I optimized the ON(..) loop from O(N2) to O(N).

Andrus

[1] https://github.com/apache/cayenne/commit/27a99fab4fb06725cba3e71d2a05d9fad129b260


> On Apr 21, 2015, at 9:16 AM, Andrus Adamchik <andrus@objectstyle.org> wrote:
> 
> Looks pretty good to me. I'll run a quick test and commit.
> 
> Thanks,
> Andrus
> 
>> On Apr 20, 2015, at 8:39 PM, Øyvind Harboe <oyvind.harboe@zylin.com> wrote:
>> 
>> Hi Andrus,
>> 
>> here's my stab at it. I'm sure it's all sorts of wrong :-) but the
>> pathological performance drop on PostgreSQL is gone.
>> 
>> I actually tested this on Cayenne 3 and then copied it over to a patch
>> on origin/master.
>> 
>> 
>> ---------- Forwarded message ----------
>> From: Andrus Adamchik <andrus@objectstyle.org>
>> Date: Mon, Apr 20, 2015 at 6:07 PM
>> Subject: Re: Problem with catastrophic performance degradation under PostgreSQL
>> To: dev@cayenne.apache.org
>> 
>> 
>> Hi,
>> 
>> So, Cayenne adds DISTINCT whenever a qualifier includes a to-many
>> relationship (for obvious reasons). Since we are dealing with an
>> object query, I guess we always know the ID columns and can rewrite
>> DISTINCT for PostgreSQL as "DISTINCT ON(id1, id2, ...)" to achieve the
>> desired effect.
>> 
>> You want to take a shot at providing a fix for that?
>> 
>> Thanks,
>> Andrus
>> 
>> 
>>> On Apr 20, 2015, at 6:09 PM, Øyvind Harboe <oyvind.harboe@zylin.com> wrote:
>>> 
>>> Q: Would it be a good idea to modify the PostgreSQL Cayenne adapter to use
>>> the 'DISTINCT ON()' syntax?
>>> 
>>> I've been testing out PostgreSQL vs. Derby for our application when I ran
>>> into a problem where the performance of PostgreSQL went from great to
>>> abysmal for no apparent reason.
>>> 
>>> After a bit of digging, I've found that the problem is with the SQL
>>> statement that Cayenne generates.
>>> 
>>> Cayenne generates statements of the following form which yields bad
>>> performance on PostgreSQL with complicated WHERE statements and numerous
>>> columns:
>>> 
>>> 1) SELECT DISTINCT a,b,c,d,e,f ... WHERE somecomplicatedstatement
>>> 
>>> If I rewrite this statement to the form below using the 'DISTINCT ON()'
>>> syntax(which is PostgreSQL specific dialect), then I get great performance
>>> again:
>>> 
>>> 2) SELECT DISTINCT ON(a) a,b,c,d,e,f ... WHERE somecomplicatedstatement
>>> 
>>> 
>>> 
>>> 
>>> Numbers on my machine:
>>> 
>>> 1) 44000ms
>>> 
>>> 2) 4300ms
>>> 
>>> Here's where I read up on the DISTINCT ON syntax:
>>> http://www.postgresql.org/docs/9.4/static/sql-select.html
>>> 
>>> 
>>> 
>>> --
>>> Øyvind Harboe - Can Zylin Consulting help on your project?
>>> http://www.zylin.com/
>> 
>> 
>> 
>> 
>> -- 
>> Øyvind Harboe - Can Zylin Consulting help on your project?
>> http://www.zylin.com/
>> <0001-postgresql-improves-performance-by-taking-advantage-.patch>
> 
> 


Mime
View raw message