cayenne-user mailing list archives

From Robert Zeigler <robert.zeig...@roxanemy.com>
Subject Re: Cayenne vs Hibernate Comparison
Date Mon, 06 Sep 2010 18:02:25 GMT

On Sep 6, 2010, at 11:16 AM, Joe Baldwin wrote:

> Robert,
> 
> All I can say is "wow", thanks for the insights.  This is especially important because
you use both frameworks.
> 
> Please let me ask some more questions.  (Note: as I said, I was initially attracted to
Cayenne because it had familiar design patterns to EOF, which I thought was fairly mature
at the time, so I may not understand the Hibernate-way of "thinking").
> 

I never used EOF, and I started using hibernate as a client requirement a couple years back,
coming to it from a "cayenne" perspective, so be aware there's some bias here. :)

> RE ObjectContext vs Session
> I may be mixed up but it sounds like the ObjectContext is similar in concept to EOF.
 It sounds like you are saying that among other things the Hibernate-Session makes simple
transactional tasks much more difficult and may even interfere with a Factory-Method approach
to building data objects within a transaction.
> 

According to my (very limited) knowledge of EOF, ObjectContext is similar. :)  Session is
somewhat similar.  This is from the Session api docs:

A typical transaction should use the following idiom:

 Session sess = factory.openSession();
 Transaction tx;
 try {
     tx = sess.beginTransaction();
     //do some work
     ...
     tx.commit();
 }
 catch (Exception e) {
     if (tx!=null) tx.rollback();
     throw e;
 }
 finally {
     sess.close();
 }

So they are demonstrating the use of a transaction to perform multiple units of work.  Of
note, they don't mention here that you would normally either call: session.flush() (before
tx.commit) or else your session would be set to "AUTO_FLUSH".  Basically, that means that
your session tries to find the difference between its managed object graph and the underlying
datastore and generate the appropriate SQL statements (updates, inserts, etc.). 
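
To make the "find the difference and generate SQL" step concrete, here is a minimal sketch in plain Java. It is a toy model, not Hibernate's implementation: ToyFlushSession, ManagedRow and the SQL strings are all made up for illustration, and the only idea it borrows is comparing each managed object against a snapshot taken when it was loaded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// One managed row: its current column values plus the snapshot taken at load time.
// (Hypothetical class, used only for this sketch.)
class ManagedRow {
    final String table;
    final int id;
    final Map<String, Object> current;
    Map<String, Object> snapshot;

    ManagedRow(String table, int id, Map<String, Object> values) {
        this.table = table;
        this.id = id;
        this.current = new LinkedHashMap<>(values);
        this.snapshot = new LinkedHashMap<>(values);
    }
}

// Toy session: flush() diffs every managed row against its snapshot.
class ToyFlushSession {
    private final List<ManagedRow> managed = new ArrayList<>();

    ManagedRow load(String table, int id, Map<String, Object> values) {
        ManagedRow row = new ManagedRow(table, id, values);
        managed.add(row);
        return row;
    }

    // Returns the SQL-like statements needed to bring the store up to date.
    List<String> flush() {
        List<String> sql = new ArrayList<>();
        for (ManagedRow row : managed) {
            List<String> changes = new ArrayList<>();
            for (Map.Entry<String, Object> e : row.current.entrySet()) {
                if (!Objects.equals(e.getValue(), row.snapshot.get(e.getKey()))) {
                    changes.add(e.getKey() + "='" + e.getValue() + "'");
                }
            }
            if (!changes.isEmpty()) {
                sql.add("UPDATE " + row.table + " SET " + String.join(", ", changes)
                        + " WHERE id=" + row.id);
                row.snapshot = new LinkedHashMap<>(row.current); // row is clean again
            }
        }
        return sql;
    }
}

class FlushDemo {
    public static void main(String[] args) {
        Map<String, Object> loaded = new LinkedHashMap<>();
        loaded.put("someProperty", "A");
        loaded.put("someProperty2", "A");

        ToyFlushSession sess = new ToyFlushSession();
        ManagedRow b = sess.load("B", 1, loaded);
        b.current.put("someProperty", "C");   // no SQL yet
        b.current.put("someProperty2", "B");  // no SQL yet

        System.out.println(sess.flush()); // one UPDATE covering both columns
        System.out.println(sess.flush()); // empty list: nothing changed since the last flush
    }
}
```

The point of the sketch is only that property changes accumulate in memory and turn into SQL at flush time, not at each setter call.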

So, how is this different than cayenne? The biggest difference comes in managing /new/ objects.
 There isn't a lot of difference in managing /existing/ database objects (except that I run
into more problems faulting relationships, see below).  Expanding the "//do some work" line
above, consider the following scenarios:

A a = new A();
B b = new B();
b.setA(a);//no exception here...
//IF: the relationship from B to A specifies cascade="save-update" (or equivalent):
//THEN: At least two insert statements generated at this point, one for a (first),
//then one for b.
//IF: the cascade isn't set
//THEN: this line will result in an exception b/c A isn't committed.
sess.save(b);

A a = new A();
B b = new B();
sess.save(b);
tx.commit();

B will be saved.  A will be gc'ed.

Suppose we now have a saved B instance.

B b = getBFromSomewhere();

b.setSomeProperty("C");//no sql yet
b.setSomeProperty2("B");//no sql yet
b.setA(new A());// no sql yet. cascade="save-update" and the flush will be ok (A SQL generated);
// with no cascade, we'll get an exception during the flush.
sess.flush();//UPDATE sql event generated.
tx.commit();//commit to the database. Note that the distinction between flush and commit really
// only matters on databases that support true transactions.

It's worth noting that in most of the code that I write on a daily basis, the boiler-plate
code is abstracted away and I never deal with it, dealing instead with DAO objects that hide
all of the session and transactions.
I'm personally less likely to use a DAO in my cayenne code because the ObjectContext is generally
a simpler paradigm, at least for me.  It's fairly simple to think of it as an in-memory transaction,
that validates your objects (at commitChanges()) before generating /any/ SQL. :) 
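
The "in-memory transaction" idea can be sketched in a few lines of plain Java. This is a toy, not Cayenne: ToyContext and ToyAudit are hypothetical classes, the method names merely echo the ones mentioned in this thread, and the "SQL" is just a string appended to a log. What it tries to show is that objects are registered when they are created, and that validation of everything pending happens before any statement is produced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a persistent object; not a Cayenne class.
class ToyAudit {
    String message;

    // In this sketch a null message counts as invalid.
    boolean isValid() {
        return message != null;
    }
}

// Toy context: a sandbox of pending objects that is written out in one step.
class ToyContext {
    private final List<ToyAudit> pending = new ArrayList<>();
    final List<String> sqlLog = new ArrayList<>();

    // Creates the object and registers it; nothing is written yet.
    ToyAudit newObject() {
        ToyAudit a = new ToyAudit();
        pending.add(a);
        return a;
    }

    // Validates every pending object first; emits SQL only if all of them pass.
    void commitChanges() {
        for (ToyAudit a : pending) {
            if (!a.isValid()) {
                throw new IllegalStateException("validation failed: message is required");
            }
        }
        for (ToyAudit a : pending) {
            sqlLog.add("INSERT INTO audit (message) VALUES ('" + a.message + "')");
        }
        pending.clear();
    }

    // Discards everything that has not been committed.
    void rollbackChanges() {
        pending.clear();
    }
}

class ContextDemo {
    // The caller never sees the audit object; it only shares the context.
    static void recordCreation(ToyContext context) {
        context.newObject().message = "Project Created";
    }

    public static void main(String[] args) {
        ToyContext context = new ToyContext();
        recordCreation(context);
        System.out.println(context.sqlLog.size()); // 0: nothing written yet
        context.commitChanges();
        System.out.println(context.sqlLog.size()); // 1: written at commit time

        context.newObject().message = "valid";
        context.newObject();                        // message left null: invalid
        try {
            context.commitChanges();
        } catch (IllegalStateException e) {
            System.out.println(context.sqlLog.size()); // still 1: no partial writes
        }
    }
}
```

Because validation runs over the whole pending set before the first INSERT is logged, a single invalid object leaves the log untouched, which is the behaviour described above for commitChanges().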


> RE Lazy Fetching
> If I have the concept correct, this is another term for what EOF calls "faulting" behavior.
 IMO optimized/smart faulting behavior is the single most important reason to use an ORM.
 The conceptual differences between a RDBMS and an OO language can result in massive problems
with a macro design, one that a good ORM solves via intelligent faulting algorithms. I had
just assumed that this was a "moot" issue based on the fundamental solutions offered by EOF.
 Are there any essential differences in features of "Lazy Fetching" that you can point out?
> 

Yes.  Lazy Fetching in hibernate is... time sensitive.  You cannot fault a relationship if
the session that loaded the parent object has been closed in the meantime.  You cannot fault
a relationship in the middle of a lifecycle/event listener (pre/post update/save/delete, etc.).
 Doing so will result in the dreaded LazyInitializationException. :)  Hibernate uses proxies
in places of Cayenne's "fault" objects.  So you get a collection with a list of proxies rather
than a collection with a list of faults.  In theory, it's the same idea.  In practice, there's
a big difference.  Namely that you can easily tell, in cayenne, if you're dealing with a fault
vs. an "inflated" object.  More appropriately, you never actually access the fault.  By the
time your code accesses an object, the faulting has occurred.  Hibernate defers faulting until
you call a method on a proxy.  It seems like a smart idea (don't fault the object unless you
really, really, /really/ need it), but the problem is "value replacement" and equivalence.
 Let me give an example, based on the above code about sessions:

Suppose we have b:

B b = getBFromSomewhere();


B has a relationship to A.  In cayenne-world, that means that the underlying property, a,
in object b, contains a fault.  In the Hibernate world, the underlying property, a, in object
b, contains a proxy, we'll call it a'.  So far, no real difference.

A a = b.getA();//Cayenne resolves the fault here, hibernate does NOT!

So in Cayenne, at this point, a is really a.  In Hibernate, a is still a proxy, ie, a is a'.

a.getSomeValue();//hibernate faults here; cayenne is already faulted.

So now a in cayenne is still a.  And a in hibernate is STILL a proxy (a')! Because hibernate
can't replace the proxy reference a (a'), with the "real" value of a.  So instead, it replaces
the /internal/ "fault" object (internal to the proxy) with the real a, and calls getSomeValue
on it, and returns it.  As I mentioned before, this is /usually/ fine, but there is a distinct
issue with inheritance doing things the Hibernate way, which is when C extends A, and a is,
in fact, an instance of C.  The proxy will ONLY be an instanceof A, even though comparisons
to the class (a.getClass()) will return the class object from the underlying object (an instance
of C), so you get the baffling behavior I mentioned in my first e-mail, where checking, eg,
C.class.isInstance(a) fails.  
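
The inheritance problem can be reproduced without any ORM at all. The sketch below uses a hand-written AProxy class as a stand-in for the proxy class that would be generated at runtime; it is an assumption-laden toy, not Hibernate's actual proxy mechanism. The proxy extends A because A is the declared type of the relationship, and it delegates to a target that is loaded on first use.

```java
import java.util.function.Supplier;

class A {
    String getSomeValue() {
        return "value from A";
    }
}

class C extends A {
    @Override
    String getSomeValue() {
        return "value from C";
    }
}

// Hand-written stand-in for a generated proxy. It extends A only,
// because the relationship is declared as type A.
class AProxy extends A {
    private final Supplier<A> loader;
    private A target; // the "real" object, loaded on demand

    AProxy(Supplier<A> loader) {
        this.loader = loader;
    }

    @Override
    String getSomeValue() {
        if (target == null) {
            target = loader.get(); // the "fault" happens on first method call
        }
        return target.getSomeValue(); // delegate to the real object
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        A a = new AProxy(C::new); // the underlying object really is a C

        System.out.println(a.getSomeValue());      // value from C
        System.out.println(a instanceof C);        // false
        System.out.println(C.class.isInstance(a)); // false
    }
}
```

Method calls reach the real C, yet the reference the calling code holds is still the proxy, so any type check against C fails. That is the same shape as the situation described above.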

> RE dbentity/objentity differences
> 
> Is the reason associated with the maturity of the "faulting" behavior (or something else)?
> 

I'm not sure, really.  I think it's more a difference in philosophy.  I have not interacted
much on the hibernate mailing list.  I've read a lot of hibernate source code and comments
(during various debugging sessions) and I've read a lot of mailing list discussions trying
to find answers to problems, but I've never interacted directly with the developers of hibernate.
 That said, my impression of the philosophy and attitude is that "ORM is about the object
layer.  Leave the db stuff to us.  Really. Trust us." :)  

> RE Google "hibernate lazyinitializationexception" to see what I mean.
> 
> OK, I googled it as you suggested, and found a few (what I call dissertations) on the
subject that suggest that Hibernate does not have a cogent "faulting" design, and that the
Hibernate-Session is not as mature a model for transactions as is Cayenne's ObjectContext
(if I understand the issues).  Is this correct?
> 
> Correct me if I am wrong (please :) ), but it is starting to sound similar to the discussions
of C++ garbage collection vs Java garbage collection (i.e. C++ doesn't really embrace garbage
collection as a problem that should be handled by anyone but the programmer)
> 

That's how I would call it. 
This is a rather insightful thread (note that Gavin is the lead hibernate developer).  It's
from 2003, but the philosophy/mindset arguably still exists in the hibernate project and its
apis:

https://forums.hibernate.org/viewtopic.php?f=1&t=42&start=0

In particular, there's this post by Gavin:

"NO!! This is very evil functionality! It is essential to the performance and transaction
isolation characteristics of Hibernate that the beginning and end of a session (ie. a connection
to the database) is demarcated by the developer! 

An auto-reconnect feature might look superficially appealing but it will result in a system
that performs like a dog and has doubtful transaction semantics."

And basically, that's it in a nutshell.  The thread is discussing fetching a lazy relationship
after the session has closed.  If that were the /only/ "fringe case" of lazy relationship
navigation in hibernate, it would probably be tolerable.  But it turns out, it's /not/ the
only fringe case.  I constantly encounter what I would call "rough edges" around Hibernate's
lazy fetching.
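
For anyone who hasn't hit the closed-session case, here is a small model of it in plain Java. ToyLazySession and LazyRef are invented for this sketch and the exception type is a plain IllegalStateException rather than Hibernate's LazyInitializationException; the only thing it tries to capture is that a lazy reference is tied to the session that created it.

```java
import java.util.function.Supplier;

// Toy session: only tracks whether it is still open.
class ToyLazySession {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    void close() {
        open = false;
    }
}

// Lazy reference tied to the session that created it (hypothetical class).
class LazyRef<T> {
    private final ToyLazySession owner;
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyRef(ToyLazySession owner, Supplier<T> loader) {
        this.owner = owner;
        this.loader = loader;
    }

    T get() {
        if (!loaded) {
            // Loading needs the owning session; once it is closed, it is too late.
            if (!owner.isOpen()) {
                throw new IllegalStateException("could not initialize: session is closed");
            }
            value = loader.get();
            loaded = true;
        }
        return value;
    }
}

class LazyDemo {
    public static void main(String[] args) {
        ToyLazySession session = new ToyLazySession();
        LazyRef<String> early = new LazyRef<>(session, () -> "loaded while open");
        LazyRef<String> late = new LazyRef<>(session, () -> "never loaded");

        System.out.println(early.get()); // loaded while open
        session.close();
        System.out.println(early.get()); // loaded while open (already initialized)
        try {
            late.get();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Whether a relationship can be navigated therefore depends on when it is first touched, which is the "time sensitive" part.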

HTH,

Robert

> Thanks again for your input,
> Joe
> 
> On Sep 6, 2010, at 11:06 AM, Robert Zeigler wrote:
> 
>> Hi Joe,
>> 
>> First, this e-mail wound up a lot longer than I intended, apologies! The short version
is: having used both a fair bit, I prefer cayenne, but they both have strengths and weaknesses.
Most of this e-mail details what I view as weaknesses in Hibernate. :)
>> 
>> On to the long version! 
>> 
>> I still know cayenne better than hibernate, but I've used both extensively (2+ years
of experience with hibernate, on a fairly large system with > 80 tables; I've used cayenne
for > 5 years now).  I don't have time right now to put together a systematic comparison,
but here are a few notes:
>> 
>> Hibernate: POJO - some people love it, some hate it. I'm in the latter camp.  You
lose out on a lot of code-reuse, debugging is more difficult, and it necessitates constructs
like hibernate proxies, which are a PITA to deal with, IMO.
>> Cayenne: interface (inheritance, in practice)-based design.  Some people don't like
the domain-structure constraints it imposes.  I find it makes debugging easier and results
in more code re-use. And, no proxies.  Objects are what they are.  More importantly, objects
are what you think they are (in hibernate-land, I've had code like the following:
>> 
>> public class MyObject2 extends MyObject1 {...}
>> 
>> elsewhere:
>> 
>> MyObject1 someobj = someCodeThatReturnsMyObject1();
>> if (MyObject2.class.isInstance(someobj)) {
>> ((MyObject2)someobj).callMyObject2Method();
>> }
>> 
>> And the above code will throw a class cast exception.  Yup.  That's right.  Because
someobj is a proxy.  A call to getClass() returns the getClass() of the underlying object
(an instance of MyObject2), but someobj, the proxy, technically only implements MyObject1.
 So you get a ClassCastException.  Other variants include your code not executing at all,
even when you know that someobj /should/ be an instance of MyObject2. The code above is, of
course, contrived, but I've hit this in numerous "real world" scenarios.
>> 
>> Hibernate & Cayenne both support "lazy fetching" in some form, but Cayenne's
support is far superior, IMO (with bona fide faulting, etc.).  This is a function, I think,
of having supported it far longer (this was initially the reason that I used Cayenne rather
than Hibernate).  Google "hibernate lazyinitializationexception" to see what I mean.  In particular,
if you go the hibernate route, be very /very/ careful what you do in event listeners (pre/post
commit, etc.) because it's very easy to hit exceptions there.  Basically, I find that in Hibernate,
event listeners are just barely better than useless.  A great idea, but you can't really /do/
anything useful in them.  And even the 3rd party modules, written by long-time hibernate developers,
hit these edge cases (eg: hibernate envers for entity auditing/logging has had issues reported
where they hit issues with lazyinitializationexception from using lifecycle listeners).  The
hibernate "way" was originally to not support lazy fetching at all; either all of the data
you needed came into the request/session at once, or it wasn't there at all.  This probably
resulted in more /performant/ code (fewer queries/hits to the db, for instance), but is basically
not feasible in the world of 3rd party modules/integration/development: it's not always possible
to know exactly what information you need at the beginning of, eg, a web request.
>> 
>> I have to give kudos to the hibernate team for the extremely flexible mappings they
support.  I find cayenne mapping more /intuitive/ (thanks in part to the modeler), but there
are edge mapping cases that are supported in hibernate that are not, to the best of my knowledge,
supported in cayenne (cayenne 3.0 improves this discrepancy, though).  As an example, hibernate
supports more inheritance modeling schemes (table per concrete subclass, table per class,
single table) than does cayenne, although cayenne 3.0 has improved in this regard.  For simple
mapping, hibernate may even be more straightforward than cayenne due to its ability to analyze
your domain objects and figure out the appropriate tables, etc. to create.  On the other hand,
I personally shy away from having hibernate auto-create my table structure.  I find it results
in less thinking about what's really occurring at the db level.  Although that is, to a greater
or lesser extent, the point of an ORM system, it's my opinion that it's still important to
think about how the data is physically mapped at the db level.  (I should note that you can
specify the exact mapping characteristics in hibernate.  But my observation is that the tendency
is to let hibernate "do its thing" until you find a problem with the way it did its thing,
and you tell hibernate the "right way" to do it). 
>> 
>> Metadata: There's no "dbentity" vs. "objentity" separation.  That's great for some
people... but really too bad. :) My personal experience is that cayenne's meta-data support
is more accessible and richer than Hibernate's, but that's probably a function, at least in
part, of familiarity with the frameworks.  
>> 
>> pks: Cayenne's approach is: "these are a database-artifact and shouldn't pollute
your data model, unless you need them to be there".  Hibernate's approach is: "pk's are an
integral part of your domain object" (for the most part).
>> 
>> ObjectContext vs. Session.  Session is a poor man's ObjectContext. ;) That's an opinion,
of course.  But. On the surface, these two objects do similar sorts of things: save, commit
transactions, etc.  But in reality, they are completely different paradigms.  In Cayenne,
an ObjectContext is very much a "sandbox" where you can make changes, roll them back, commit
them, etc.  A hibernate session is more like a command queue: you instruct it to update, save,
or delete specific objects, ask it for "Criteria" for criteria-based queries, etc.  They may
sound similar but there's a big difference in how you use them.  Basically, hibernate doesn't
have the notion of "modified or new object that needs to be saved at some point in the future,
but which I should retain a reference to now." :) In cayenne, you can do something like this:
>> 
>> void someMethod(ObjectContext context) {
>>  context.newObject(SomePersistentObject.class).setSomeProperty("foo");
>>  ...
>> }
>> 
>> Now when that particular context is committed, a new instance of SomePersistentObject
will be committed, without the calling code having to know about it.  Arguably, this is a
method with "side effects" that should be avoided, but there are legitimate use cases for
this.  Consider a recent example I encountered.  A hibernate project I work on manages a set
of "projects".  Changes to projects are audited, except when the project is first created/in
state "project_created" (a custom flag, unrelated to hibernate).  I recently needed to add
support for one auditing operation: record the date of creation, and the user who created
the project.  Without getting into gory details, the simplest way to do this would have been
to modify the service responsible for creating all project types, along the lines of this
(how I would do this in cayenne):
>> 
>> public <T extends Project> T createProject(Class<T> type) {
>>  T project = codeToCreateProject();
>>  Audit a = objectContext.newObject(Audit.class);
>>  a.setProject(project);
>>  a.setMessage("Project Created");
>>  a.setDate(new Date());
>>  return project;
>> }
>> 
>> Notes: the project creator does not (and cannot, due to design constraints) commit
the project to the database at this point in the code.  That's fine in cayenne: as long as
the calling code is using the same object context (it always would be in my case), the Audit
object would be committed at the same time the project is, and life would be happy.  But the
project is not cayenne. It is hibernate.  So:
>> 
>> public <T extends Project> T createProject(Class<T> type) {
>>  T project = codeToCreateProject();
>>  Audit a = new Audit();
>>  a.setProject(project);
>>  a.setMessage("Project Created");
>>  a.setDate(new Date());
>>  return project;
>> }
>> 
>> Except, what happens to a? The answer is: nothing.  It isn't ever saved.  It would
be, if Project had a reverse relationship to audit (List<Audit> getAudits()), that was
set to cascade the "save" and "update" operations.
>> But Project didn't/doesn't, and I wasn't allowed to add it.  There was no way to
tell hibernate: "Look, I've got this object, and I want you to save it, but, not right this
second".  You can call: session.save(a).  But that results in an immediate commit of the audit
object (and ONLY the audit object!), so if the project isn't yet persisted to the db, you
get a relationship constraint violation, trying to save a relationship to an unsaved object.
 There's also a session.persist(a) method, part of EJB3 spec, which is theoretically like
cayenne's "register", but in hibernate, it's functionally equivalent (or very nearly so) to
session.save(a): it triggers an immediate commit to the database (at least in our application
setup).  There is no equivalent to cayenne's "context.register(a)".  I finally solved this
issue via life cycle event listeners, and it was a pain (you have to be /extremely/ careful
about what you do in hibernate event listeners.  In particular, read operations that result
in a hit to the database will cause you major grief, even if you don't modify anything, and
modification of any kind is next to impossible).  
>> 
>> All that said, there are /some/ good ideas in hibernate. :)  For one thing, Cayenne's
/requirement/ that two objects with a shared relationship be in the same ObjectContext can
cause grief, particularly in web applications.  Imagine you have a form to create a new object
of type Foo.  Foo has a relationship to Bar.  You may not want to register this object with
the context until you know that the new Foo object is a "valid" object (lest you wind up with
"dirty" objects polluting subsequent commits, using an ObjectContext-per-user session paradigm).
 But you can't do that: when you set the Bar relationship, Foo will be registered with the
context.  That's usually fine... you can usually rollback the changes... but it does mean
sometimes having to think carefully about what "state" your objects are in.
>> 
>> I've yet to find the "perfect" ORM.  There isn't one, as far as I'm concerned, b/c
there's simply a mismatch between the db model and the object model that will result in tradeoffs.
 But I find Cayenne far easier to learn and use than Hibernate.
>> 
>> Cheers,
>> 
>> Robert
>> 
>> On Sep 5, 2010, at 1:21 PM, Joe Baldwin wrote:
>> 
>>> Hi,
>>> 
>>> I am again responsible for making a cogent Cayenne vs Hibernate Comparison. 
Before I "reinvent the wheel" so to speak with a new evaluation, I would like to find out
if anyone has done a recent and fair comparison/evaluation (and has published it).
>>> 
>>> When I initially performed my evaluation of the two, it seemed like a very easy
decision.  While Hibernate had been widely adopted (and was on a number of job listings),
it seemed like the core decision was made mostly because "everyone else was using it" (which
I thought was a bit thin).
>>> 
>>> I based my decision on the fact that Cayenne (at the time) supported enough of
the core ORM features that I needed, in addition to being very similar conceptually to NeXT
EOF (which was one of the first stable Enterprise-ready ORM implementations).  Cayenne seems to support
a more "agile" development model, while being as (or more) mature than EOF.  (In my opinion.
:) )
>>> 
>>> It seems like there is an explosion of standards, which appear to be driven by
"camps" of opinions on the best practices for accomplishing abstraction of persistence supporting
both native apps and highly distributed SOA's.
>>> 
>>> My vote is obviously for Cayenne, but I would definitely like to update my understanding
of the comparison.
>>> 
>>> Thanks,
>>> Joe
>>> 
>> 
> 

