openjpa-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Dick" <michael.d.d...@gmail.com>
Subject Re: datacache
Date Fri, 30 May 2008 19:02:07 GMT
Since this has changed the default behavior would it be better to put the
flag in the Compatibility class? If the default were true (original
behavior) I think it'd be fair game to put it in the Configuration class. I
like the idea of isolating changes from release to release in a single
place.

I don't have strong feelings in this regard, but I'm curious as to what the
rest of the community thinks. Maybe I'm overusing the Compatibily object ;-)

-Mike

On Fri, May 30, 2008 at 1:34 PM, Pinaki Poddar <ppoddar@apache.org> wrote:

>
> 1. Added a boolean (dynamic) property openjpa.RefreshFromDataCache. It
> defaults to false. But if set to true, DataCache will behaves as it were
> prior to the changes being discussed in this thread.
>
>
>
> Patrick Linskey-2 wrote:
> >
> >>> But, Patrick, you lead a wider discussion on the implicit design
> >>> assumption: "the data cache is always correct with respect to the
> >>> database".
> >>> Going forward, with pessimistic transaction being part of the spec,
> >>> this
> >>> view of DataCache being the 'in-memory replica of database' will be
> >>> harder
> >>> to comply and perhaps requires a revisit.
> >
> > Bear in mind that OpenJPA bypasses the cache when using pessimistic
> > locking.
> >
> >>> 2. Inconsistent in-memory model:
> >
> > As I've mentioned before, I believe that any such inconsistencies
> > should be treated as bugs, not design features.
> >
> >>> Can we analyze the data in DataCache to detect such data that are
> >>> prone to
> >>> inconsistencies and hence leave them from being cached?  perhaps yes.
> >
> > We can definitely detect such issues.
> >
> >>> Does it worth it? no, because assuming that data is mostly
> >>> consistent, it
> >>> will penalize the consistent model to save some careless programmer
> >>> who
> >>> could not keep their model consistent.  Also InverseManager already
> >>> can do
> >>> that at L1 cache level, for some inconsistencies.
> >
> > I do not believe that w can deem up-front and in general that such
> > inconsistencies are not worth resolving. I believe that we must look
> > at these bugs individually when determining how to resolve things.
> >
> >>> 3. retiring the 'optimization' on refresh
> >>> there were two refreshInternal() methods, one with a single
> >>> instance and
> >>> other with a collection. They had very similar code except a
> >>> boolean flag
> >>> that impacted behavior of a single instance. The collection form used
> >>> loadAll() for better performance and I kept that. Maintaining very
> >>> similar
> >>> (but not the same) code for a single instance versus a collection of
> >>> instance is risky.
> >
> > What measurements, if any, did you perform along with the change? What
> > is the performance impact of calling the plural method in the single-
> > instance case?
> >
> > -Patrick
> >
> > On May 29, 2008, at 8:42 AM, Craig L Russell wrote:
> >
> >>
> >> On May 28, 2008, at 10:02 AM, Pinaki Poddar wrote:
> >>
> >>>
> >>> Hi Patrick,
> >>>> Historically, we've assumed that for the purposes of refresh(), the
> >>>> data cache is always correct with respect to the database.
> >>>
> >>> Yes, I understand the import/impact of my suggestion to bypass
> >>> DataCache
> >>> altogether for refresh(). And hence I value what you are saying in
> >>> this
> >>> regard. My suggestion is leaned towards spec-compliance (or rather my
> >>> assumption of what the mythical 'user' wants as refresh()
> >>> behavior). The
> >>> spec says for refresh() as in JavaDoc: "Refresh the state of the
> >>> instance
> >>> from the database, overwriting changes made to the entity, if any".
> >>> I read
> >>> it as: "the user wants refresh() to hit the database". You can
> >>> always refute
> >>> that assumption because of the core axiom:" Nobody really knows
> >>> what does
> >>> the user want".
> >>
> >> The issue here is that the data cache is not a specified concept. So
> >> we are really on our own here.
> >>
> >> The concept of refresh is in JPA, but the concept of evict is not.
> >> So I think that refresh should work as advertised with or without
> >> the data cache. Which means that data should be retrieved from the
> >> database (not the cache).
> >>
> >> But clearly, some applications will operate in an environment where
> >> the data cache is always kept current, by the global cache manager.
> >> In these cases, it is sufficient to refresh from the cache and not
> >> go to the database. For this situation, I'd prefer that the user set
> >> a flag in the EntityManagerFactory that says RefreshFromCache if the
> >> global cache manager can guarantee that the cache is kept in sync
> >> with the database.
> >>
> >> And I'd go farther, by asking that this concept be put into the JPA
> >> spec explicitly, if the expert group adopts the cache API.
> >>
> >> Craig
> >>>
> >>>
> >>> The other relevant bit is the flag FORCE_LOAD_REFRESH in
> >>> StoreManager. The
> >>> flag was there and Broker.refreshInternal() used it to call the
> >>> StoreManager.loadAll(). That is the only place the flag had been
> >>> used.
> >>> However, DataCacheStoreManager ignored it. In fact, I used it now
> >>> to bypass
> >>> DataCache for refresh(). May be some one can comment on the
> >>> original intent
> >>> behind that flag?
> >>>
> >>> But, Patrick, you lead a wider discussion on the implicit design
> >>> assumption: "the data cache is always correct with respect to the
> >>> database".
> >>> Going forward, with pessimistic transaction being part of the spec,
> >>> this
> >>> view of DataCache being the 'in-memory replica of database' will be
> >>> harder
> >>> to comply and perhaps requires a revisit.
> >>>
> >>> 2. Inconsistent in-memory model:
> >>> The other issue being discussed in this thread i.e. "inconsistent
> >>> in-memory data model" also tends to show limitation of the view that
> >>> DataCache is a replica of database albeit from a representational
> >>> form.
> >>> Database represents relation/data differently than DataCache does.
> >>> Inconsistent relations, under some mapping, automagically becomes
> >>> consistent
> >>> in the database because while Java (and DataCache) represents a
> >>> bi-directional relation with a pair of variables (and hence leaving
> >>> room for
> >>> inconsistency), database has a single foreign key column and hence
> >>> no scope
> >>> for ambiguity.
> >>> Can we analyze the data in DataCache to detect such data that are
> >>> prone to
> >>> inconsistencies and hence leave them from being cached?  perhaps yes.
> >>> Does it worth it? no, because assuming that data is mostly
> >>> consistent, it
> >>> will penalize the consistent model to save some careless programmer
> >>> who
> >>> could not keep their model consistent.  Also InverseManager already
> >>> can do
> >>> that at L1 cache level, for some inconsistencies.
> >>>
> >>>
> >>> 3. retiring the 'optimization' on refresh
> >>> there were two refreshInternal() methods, one with a single
> >>> instance and
> >>> other with a collection. They had very similar code except a
> >>> boolean flag
> >>> that impacted behavior of a single instance. The collection form used
> >>> loadAll() for better performance and I kept that. Maintaining very
> >>> similar
> >>> (but not the same) code for a single instance versus a collection of
> >>> instance is risky.
> >>>
> >>>
> >>>
> >>>
> >>> Patrick Linskey-2 wrote:
> >>>>
> >>>>> b) bypasses Data cache for refresh() altogether
> >>>>
> >>>> Historically, we've assumed that for the purposes of refresh(), the
> >>>> data cache is always correct with respect to the database. I.e., the
> >>>> assumption has been that if changes were made out-of-band, then the
> >>>> developer would call DataCache.evict() prior to refresh(). It would
> >>>> seem that your suggestion significantly changes this assumption.
> >>>>
> >>>> For the case of Broker.evict() calls, the Broker has a special
> >>>> setting
> >>>> (setEvictFromDataCache()) that allows configuration of whether or
> >>>> not
> >>>> Broker.evict() also calls DataCache.evict(). We should consider this
> >>>> precedent and the prior assumptions when coming up with a strategy
> >>>> for
> >>>> changing the semantics of refresh().
> >>>>
> >>>>> c) throws away 'optimization' for a single entity
> >>>>
> >>>> Can you describe this in more detail? Optimizations are generally
> >>>> there for a reason -- getting rid of it might very well lead to sub-
> >>>> optimal behavior.
> >>>>
> >>>> -Patrick
> >>>>
> >>>> On May 28, 2008, at 9:38 AM, Pinaki Poddar wrote:
> >>>>
> >>>>>
> >>>>> Hi Craig,
> >>>>> Committed revision 660753 towards a model that
> >>>>> a) corrects refresh() behavior of single, clean entity
> >>>>> b) bypasses Data cache for refresh() altogether
> >>>>> c) throws away 'optimization' for a single entity
> >>>>>
> >>>>> Pinaki
> >>>>>
> >>>>>
> >>>>> Craig L Russell wrote:
> >>>>>>
> >>>>>> Hi Pinaki,
> >>>>>>
> >>>>>> On May 27, 2008, at 3:00 PM, Pinaki Poddar wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> After some more analysis of refresh() issue...
> >>>>>>>
> >>>>>>> 1. it is observed that a refresh of a single, clean instance
> >>>>>>> never
> >>>>>>> hits the
> >>>>>>> database -- irrespective of whether Data Cache is active
or not.
> >>>>>>> That does
> >>>>>>> not appear spec compliant.
> >>>>>>
> >>>>>> I agree.
> >>>>>>>
> >>>>>>>
> >>>>>>> 2. refresh() behaves differently on current lock level.
With NO
> >>>>>>> LOCK
> >>>>>>> it
> >>>>>>> reads from Data Cache; on any stronger lock it hits the
database.
> >>>>>>> I am of the opinion that all refresh() must bypass data
cache
> >>>>>>> altogether
> >>>>>>> always -- because refresh() seems to express explicit intent
of
> >>>>>>> the
> >>>>>>> user to
> >>>>>>> read data from database (say when the application thinks
that
> >>>>>>> out-
> >>>>>>> of-
> >>>>>>> band
> >>>>>>> modifications may have taken place on the database).
> >>>>>>
> >>>>>> I agree.
> >>>>>>>
> >>>>>>>
> >>>>>>> 3. There is an 'optimization' on BrokerImpl.refresh() --
one
> >>>>>>> for a
> >>>>>>> single
> >>>>>>> entity and other for a collection. Removing that optimization
> >>>>>>> (which
> >>>>>>> leaves
> >>>>>>> some maintenance concern of similar but not same code blocks)
is
> >>>>>>> another
> >>>>>>> suggestion.
> >>>>>>
> >>>>>> Seems that if the user calls refresh on a single entity or on
a
> >>>>>> collection, then we should hit the database every time. Who
are
> >>>>>> we to
> >>>>>> know that the database hasn't changed in the last millisecond?
> >>>>>> Sure,
> >>>>>> we're smart, but we're not omniscient.
> >>>>>>
> >>>>>> Craig
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Comments?
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> View this message in context:
> >>>>>>> http://www.nabble.com/datacache-tp17326391p17501042.html
> >>>>>>> Sent from the OpenJPA Developers mailing list archive at
> >>>>>>> Nabble.com.
> >>>>>>>
> >>>>>>
> >>>>>> Craig Russell
> >>>>>> Architect, Sun Java Enterprise System http://java.sun.com/
> >>>>>> products/
> >>>>>> jdo
> >>>>>> 408 276-5638 mailto:Craig.Russell@sun.com
> >>>>>> P.S. A good JDO? O, Gasp!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> View this message in context:
> >>>>> http://www.nabble.com/datacache-tp17326391p17502434.html
> >>>>> Sent from the OpenJPA Developers mailing list archive at
> >>>>> Nabble.com.
> >>>>>
> >>>>
> >>>> --
> >>>> Patrick Linskey
> >>>> 202 669 5907
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> View this message in context:
> >>> http://www.nabble.com/datacache-tp17326391p17517403.html
> >>> Sent from the OpenJPA Developers mailing list archive at Nabble.com.
> >>>
> >>
> >> Craig Russell
> >> Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
> >> 408 276-5638 mailto:Craig.Russell@sun.com
> >> P.S. A good JDO? O, Gasp!
> >>
> >
> > --
> > Patrick Linskey
> > 202 669 5907
> >
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/datacache-tp17326391p17565542.html
> Sent from the OpenJPA Developers mailing list archive at Nabble.com.
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message