cayenne-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Gentry <>
Subject Re: Performance Issues/Questions
Date Sat, 12 Dec 2009 14:38:19 GMT
On Sat, Dec 12, 2009 at 1:23 AM, Aristedes Maniatis <> wrote:
> On 12/12/09 3:54 PM, Michael Gentry wrote:
>> I created a brand new DC in each thread, got the system time, ran the
>> query (and possibly committed a change), then got the system time
>> again to compute the difference.
> OK. So you even though you aren't directly counting the time spent creating
> contexts, since your test is threaded you are implicitly counting that time.
> Say (I don't know) that one ORM locks around creating new contexts, then you
> are incorporating that into your timing results.

Good point.

> The problem with a test like this is that it doesn't always reproduce how
> this works in real life. Would you be creating lots of new contexts
> concurrently in your real application?

I'm well aware that the test isn't exactly realistic.  I was just a
little stunned at how it actually fared.  Most likely will not create
a lot of new contexts concurrently, but we were just trying to stress
it for a more worst-case spike type situation.

>> So the numbers are the time it takes
>> the thread to complete, for the most part (I kick them off at the same
>> time -- at least as closely as I can).
> You should be recording and reporting the start and end times of each thread
> in order to understand if locking is causing some to wait. If you graphed
> them on a little gantt style chart, then that would become apparent. If
> you've ever used Firebug's network analysis, you'll know what I mean. When
> did the thread start? When did you get a new context returned? How long did
> the fetch take? Then the commit?

I can make that change on Monday and try again.

>>  My guess is it is
>> locking/concurrency, but I didn't have time to delve into it any
>> deeper (was the end of the day).  When I ran RetrieveIndividual20 with
>> 5 threads instead of 20 the results were MUCH closer to the Hibernate
>> values (sometimes better).  (Keep in mind both ORMs had a limit of 5
>> open DB channels.)
> There are so many things going on here, it is hard to isolate the
> differences. Perhaps you need to fire up a profiler and see where it is
> spending most of its time. Is there a big impact going from equal or less
> than the number of db channels to one more?

Good idea.  I'll try that one, too.

>>> * There is so much variance between runs that something is odd here. What
>>> are you doing between runs to clear the cache? How are you dealing with
>>> variability at the database? Is the database running on another server
>>> which
>>> is unloaded and isn't itself interfering with the results?
>> I ran them several times and each time gave similar numbers.  The ones
>> in the spreadsheet are just from the last run I did, but they were all
>> similar from run-to-run.
> Ah, I thought these were the results of lots of runs. But you are showing
> the time from each thread in a single run. In that case the wild variance is
> definitely the result of come sort of locking. Can you compare CPU usage
> between the two ORMs? That would give you somewhat of an idea whether the
> time is spent in a lock or doing something.

I didn't think to watch the CPU, but I'll try that also on Monday.

>> The DB was on a different server, which is
>> also another reason I did several runs.  I expect a few variances, but
>> the general pattern was the same.  I don't have the QueryLogger
>> results here, but the DB times were significantly faster, so it really
>> was on the local Java side.
>>> * The huge variability makes me think that memory pressures might be
>>> important. Are you seeing the JVM garbage collect during the middle of
>>> some
>>> runs? Can you run jstat and see if that is interfering.
>> Didn't have time to get into GC.  The machine I was on had about 5 GB
>> of RAM free and I had allocated plenty to the JVM.  Also, fetching and
>> updating 20 records in the DB shouldn't be that memory intensive.
> That isn't really the point. No matter how much RAM you have Java may
> perform GC, depending on your test application, how much stack/heap you gave
> to the JVM, how long the application has been running, etc.

Well, this test application hadn't been running very long.  I did give
it more RAM than the default, but I can try tweaking it again.

>>> * Are you able to pick the worst case (update) and time the various
>>> parts:
>>> create context, database UPDATE, commit finished... In your tests are you
>>> timing creating the context or is that outside the timing loop?
>> I was creating the DC outside of the timing.
> Sort of, but not really. If you want to eliminate that from your
> measurement, you need to spawn 20 threads, have them all create contexts,
> fetch the data and then sleep. Then all at once you can wake them to perform
> an update concurrently and you'll be measuring only the update.

That's something else I can try, too.

> None of this is much use to tell your team how well Cayenne might work in
> your application in the real world since the tests don't really resemble how
> most ORMs are used. However if you are trying to isolate a specific thing to
> measure, it might be useful.

I completely understand the example isn't as real-world as desired.
We were just trying to get a quick gauge.  (Spent less than a day on
the Hibernate + Cayenne example combined.)  That one particular
example was so far off (over 10x) the Hibernate version that it was
disturbing, so I thought I'd ask for ideas (which you've given me --

I suspect that if instead of 20 concurrent threads I used 20
sequential threads the thing would run significantly quicker, but
that's something else to try on Monday.

FWIW, here is the pseudocode for the test:

UpdateIdea20's run() method (subclass of Thread):
    DataContext dc = DataContext.createDataContext();
    int start = System.currentTimeMillis();
    Expression exp = ExpressionFactory.match(key, value);
    SelectQuery query = new SelectQuery(Idea.class, exp);
    List<Idea> list = dc.performQuery(query);
    Idea idea = list.get(0);
    idea.setName(random string);
    int end = System.currentTimeMillis();
    println(end - start);

Thanks again!


View raw message