cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrus Adamchik <and...@objectstyle.org>
Subject Re: Problem with catastrophic performance degradation under PostgreSQL
Date Tue, 21 Apr 2015 06:16:24 GMT
Looks pretty good to me. I'll run a quick test and commit.

Thanks,
Andrus

> On Apr 20, 2015, at 8:39 PM, Øyvind Harboe <oyvind.harboe@zylin.com> wrote:
> 
> Hi Andrus,
> 
> here's my stab at it. I'm sure it's all sorts of wrong :-) but the
> pathological performance drop on PostgreSQL is gone.
> 
> I actually tested this on Cayenne 3 and then copied it over to a patch
> on origin/master.
> 
> 
> ---------- Forwarded message ----------
> From: Andrus Adamchik <andrus@objectstyle.org>
> Date: Mon, Apr 20, 2015 at 6:07 PM
> Subject: Re: Problem with catastrophic performance degradation under PostgreSQL
> To: dev@cayenne.apache.org
> 
> 
> Hi,
> 
> So, Cayenne adds DISTINCT whenever a qualifier includes a to-many
> relationship (for obvious reasons). Since we are dealing with an
> object query, I guess we always know the ID columns and can rewrite
> DISTINCT for PostgreSQL as "DISTINCT ON(id1, id2, ...)" to achieve the
> desired effect.
> 
> You want to take a shot at providing a fix for that?
> 
> Thanks,
> Andrus
> 
> 
>> On Apr 20, 2015, at 6:09 PM, Øyvind Harboe <oyvind.harboe@zylin.com> wrote:
>> 
>> Q: Would it be a good idea to modify the PostgreSQL Cayenne adapter to use
>> the 'DISTINCT ON()' syntax?
>> 
>> I've been testing out PostgreSQL vs. Derby for our application when I ran
>> into a problem where the performance of PostgreSQL went from great to
>> abysmal for no apparent reason.
>> 
>> After a bit of digging, I've found that the problem is with the SQL
>> statement that Cayenne generates.
>> 
>> Cayenne generates statements of the following form which yields bad
>> performance on PostgreSQL with complicated WHERE statements and numerous
>> columns:
>> 
>> 1) SELECT DISTINCT a,b,c,d,e,f ... WHERE somecomplicatedstatement
>> 
>> If I rewrite this statement to the form below using the 'DISTINCT ON()'
>> syntax(which is PostgreSQL specific dialect), then I get great performance
>> again:
>> 
>> 2) SELECT DISTINCT ON(a) a,b,c,d,e,f ... WHERE somecomplicatedstatement
>> 
>> 
>> 
>> 
>> Numbers on my machine:
>> 
>> 1) 44000ms
>> 
>> 2) 4300ms
>> 
>> Here's where I read up on the DISTINCT ON syntax:
>> http://www.postgresql.org/docs/9.4/static/sql-select.html
>> 
>> 
>> 
>> --
>> Øyvind Harboe - Can Zylin Consulting help on your project?
>> http://www.zylin.com/
> 
> 
> 
> 
> -- 
> Øyvind Harboe - Can Zylin Consulting help on your project?
> http://www.zylin.com/
> <0001-postgresql-improves-performance-by-taking-advantage-.patch>


Mime
View raw message