openjpa-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pinaki Poddar <ppod...@apache.org>
Subject Re: datacache
Date Fri, 30 May 2008 18:34:21 GMT

1. Added a boolean (dynamic) property openjpa.RefreshFromDataCache. It
defaults to false. But if set to true, DataCache will behaves as it were
prior to the changes being discussed in this thread.

 

Patrick Linskey-2 wrote:
> 
>>> But, Patrick, you lead a wider discussion on the implicit design
>>> assumption: "the data cache is always correct with respect to the  
>>> database".
>>> Going forward, with pessimistic transaction being part of the spec,  
>>> this
>>> view of DataCache being the 'in-memory replica of database' will be  
>>> harder
>>> to comply and perhaps requires a revisit.
> 
> Bear in mind that OpenJPA bypasses the cache when using pessimistic  
> locking.
> 
>>> 2. Inconsistent in-memory model:
> 
> As I've mentioned before, I believe that any such inconsistencies  
> should be treated as bugs, not design features.
> 
>>> Can we analyze the data in DataCache to detect such data that are  
>>> prone to
>>> inconsistencies and hence leave them from being cached?  perhaps yes.
> 
> We can definitely detect such issues.
> 
>>> Does it worth it? no, because assuming that data is mostly  
>>> consistent, it
>>> will penalize the consistent model to save some careless programmer  
>>> who
>>> could not keep their model consistent.  Also InverseManager already  
>>> can do
>>> that at L1 cache level, for some inconsistencies.
> 
> I do not believe that w can deem up-front and in general that such  
> inconsistencies are not worth resolving. I believe that we must look  
> at these bugs individually when determining how to resolve things.
> 
>>> 3. retiring the 'optimization' on refresh
>>> there were two refreshInternal() methods, one with a single  
>>> instance and
>>> other with a collection. They had very similar code except a  
>>> boolean flag
>>> that impacted behavior of a single instance. The collection form used
>>> loadAll() for better performance and I kept that. Maintaining very  
>>> similar
>>> (but not the same) code for a single instance versus a collection of
>>> instance is risky.
> 
> What measurements, if any, did you perform along with the change? What  
> is the performance impact of calling the plural method in the single- 
> instance case?
> 
> -Patrick
> 
> On May 29, 2008, at 8:42 AM, Craig L Russell wrote:
> 
>>
>> On May 28, 2008, at 10:02 AM, Pinaki Poddar wrote:
>>
>>>
>>> Hi Patrick,
>>>> Historically, we've assumed that for the purposes of refresh(), the
>>>> data cache is always correct with respect to the database.
>>>
>>> Yes, I understand the import/impact of my suggestion to bypass  
>>> DataCache
>>> altogether for refresh(). And hence I value what you are saying in  
>>> this
>>> regard. My suggestion is leaned towards spec-compliance (or rather my
>>> assumption of what the mythical 'user' wants as refresh()  
>>> behavior). The
>>> spec says for refresh() as in JavaDoc: "Refresh the state of the  
>>> instance
>>> from the database, overwriting changes made to the entity, if any".  
>>> I read
>>> it as: "the user wants refresh() to hit the database". You can  
>>> always refute
>>> that assumption because of the core axiom:" Nobody really knows  
>>> what does
>>> the user want".
>>
>> The issue here is that the data cache is not a specified concept. So  
>> we are really on our own here.
>>
>> The concept of refresh is in JPA, but the concept of evict is not.  
>> So I think that refresh should work as advertised with or without  
>> the data cache. Which means that data should be retrieved from the  
>> database (not the cache).
>>
>> But clearly, some applications will operate in an environment where  
>> the data cache is always kept current, by the global cache manager.  
>> In these cases, it is sufficient to refresh from the cache and not  
>> go to the database. For this situation, I'd prefer that the user set  
>> a flag in the EntityManagerFactory that says RefreshFromCache if the  
>> global cache manager can guarantee that the cache is kept in sync  
>> with the database.
>>
>> And I'd go farther, by asking that this concept be put into the JPA  
>> spec explicitly, if the expert group adopts the cache API.
>>
>> Craig
>>>
>>>
>>> The other relevant bit is the flag FORCE_LOAD_REFRESH in  
>>> StoreManager. The
>>> flag was there and Broker.refreshInternal() used it to call the
>>> StoreManager.loadAll(). That is the only place the flag had been  
>>> used.
>>> However, DataCacheStoreManager ignored it. In fact, I used it now  
>>> to bypass
>>> DataCache for refresh(). May be some one can comment on the  
>>> original intent
>>> behind that flag?
>>>
>>> But, Patrick, you lead a wider discussion on the implicit design
>>> assumption: "the data cache is always correct with respect to the  
>>> database".
>>> Going forward, with pessimistic transaction being part of the spec,  
>>> this
>>> view of DataCache being the 'in-memory replica of database' will be  
>>> harder
>>> to comply and perhaps requires a revisit.
>>>
>>> 2. Inconsistent in-memory model:
>>> The other issue being discussed in this thread i.e. "inconsistent
>>> in-memory data model" also tends to show limitation of the view that
>>> DataCache is a replica of database albeit from a representational  
>>> form.
>>> Database represents relation/data differently than DataCache does.
>>> Inconsistent relations, under some mapping, automagically becomes  
>>> consistent
>>> in the database because while Java (and DataCache) represents a
>>> bi-directional relation with a pair of variables (and hence leaving  
>>> room for
>>> inconsistency), database has a single foreign key column and hence  
>>> no scope
>>> for ambiguity.
>>> Can we analyze the data in DataCache to detect such data that are  
>>> prone to
>>> inconsistencies and hence leave them from being cached?  perhaps yes.
>>> Does it worth it? no, because assuming that data is mostly  
>>> consistent, it
>>> will penalize the consistent model to save some careless programmer  
>>> who
>>> could not keep their model consistent.  Also InverseManager already  
>>> can do
>>> that at L1 cache level, for some inconsistencies.
>>>
>>>
>>> 3. retiring the 'optimization' on refresh
>>> there were two refreshInternal() methods, one with a single  
>>> instance and
>>> other with a collection. They had very similar code except a  
>>> boolean flag
>>> that impacted behavior of a single instance. The collection form used
>>> loadAll() for better performance and I kept that. Maintaining very  
>>> similar
>>> (but not the same) code for a single instance versus a collection of
>>> instance is risky.
>>>
>>>
>>>
>>>
>>> Patrick Linskey-2 wrote:
>>>>
>>>>> b) bypasses Data cache for refresh() altogether
>>>>
>>>> Historically, we've assumed that for the purposes of refresh(), the
>>>> data cache is always correct with respect to the database. I.e., the
>>>> assumption has been that if changes were made out-of-band, then the
>>>> developer would call DataCache.evict() prior to refresh(). It would
>>>> seem that your suggestion significantly changes this assumption.
>>>>
>>>> For the case of Broker.evict() calls, the Broker has a special  
>>>> setting
>>>> (setEvictFromDataCache()) that allows configuration of whether or  
>>>> not
>>>> Broker.evict() also calls DataCache.evict(). We should consider this
>>>> precedent and the prior assumptions when coming up with a strategy  
>>>> for
>>>> changing the semantics of refresh().
>>>>
>>>>> c) throws away 'optimization' for a single entity
>>>>
>>>> Can you describe this in more detail? Optimizations are generally
>>>> there for a reason -- getting rid of it might very well lead to sub-
>>>> optimal behavior.
>>>>
>>>> -Patrick
>>>>
>>>> On May 28, 2008, at 9:38 AM, Pinaki Poddar wrote:
>>>>
>>>>>
>>>>> Hi Craig,
>>>>> Committed revision 660753 towards a model that
>>>>> a) corrects refresh() behavior of single, clean entity
>>>>> b) bypasses Data cache for refresh() altogether
>>>>> c) throws away 'optimization' for a single entity
>>>>>
>>>>> Pinaki
>>>>>
>>>>>
>>>>> Craig L Russell wrote:
>>>>>>
>>>>>> Hi Pinaki,
>>>>>>
>>>>>> On May 27, 2008, at 3:00 PM, Pinaki Poddar wrote:
>>>>>>
>>>>>>>
>>>>>>> After some more analysis of refresh() issue...
>>>>>>>
>>>>>>> 1. it is observed that a refresh of a single, clean instance
 
>>>>>>> never
>>>>>>> hits the
>>>>>>> database -- irrespective of whether Data Cache is active or not.
>>>>>>> That does
>>>>>>> not appear spec compliant.
>>>>>>
>>>>>> I agree.
>>>>>>>
>>>>>>>
>>>>>>> 2. refresh() behaves differently on current lock level. With
NO  
>>>>>>> LOCK
>>>>>>> it
>>>>>>> reads from Data Cache; on any stronger lock it hits the database.
>>>>>>> I am of the opinion that all refresh() must bypass data cache
>>>>>>> altogether
>>>>>>> always -- because refresh() seems to express explicit intent
of  
>>>>>>> the
>>>>>>> user to
>>>>>>> read data from database (say when the application thinks that
 
>>>>>>> out-
>>>>>>> of-
>>>>>>> band
>>>>>>> modifications may have taken place on the database).
>>>>>>
>>>>>> I agree.
>>>>>>>
>>>>>>>
>>>>>>> 3. There is an 'optimization' on BrokerImpl.refresh() -- one
 
>>>>>>> for a
>>>>>>> single
>>>>>>> entity and other for a collection. Removing that optimization
 
>>>>>>> (which
>>>>>>> leaves
>>>>>>> some maintenance concern of similar but not same code blocks)
is
>>>>>>> another
>>>>>>> suggestion.
>>>>>>
>>>>>> Seems that if the user calls refresh on a single entity or on a
>>>>>> collection, then we should hit the database every time. Who are 

>>>>>> we to
>>>>>> know that the database hasn't changed in the last millisecond?  
>>>>>> Sure,
>>>>>> we're smart, but we're not omniscient.
>>>>>>
>>>>>> Craig
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Comments?
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/datacache-tp17326391p17501042.html
>>>>>>> Sent from the OpenJPA Developers mailing list archive at  
>>>>>>> Nabble.com.
>>>>>>>
>>>>>>
>>>>>> Craig Russell
>>>>>> Architect, Sun Java Enterprise System http://java.sun.com/ 
>>>>>> products/
>>>>>> jdo
>>>>>> 408 276-5638 mailto:Craig.Russell@sun.com
>>>>>> P.S. A good JDO? O, Gasp!
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>> View this message in context:
>>>>> http://www.nabble.com/datacache-tp17326391p17502434.html
>>>>> Sent from the OpenJPA Developers mailing list archive at  
>>>>> Nabble.com.
>>>>>
>>>>
>>>> -- 
>>>> Patrick Linskey
>>>> 202 669 5907
>>>>
>>>>
>>>>
>>>>
>>>
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/datacache-tp17326391p17517403.html
>>> Sent from the OpenJPA Developers mailing list archive at Nabble.com.
>>>
>>
>> Craig Russell
>> Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
>> 408 276-5638 mailto:Craig.Russell@sun.com
>> P.S. A good JDO? O, Gasp!
>>
> 
> -- 
> Patrick Linskey
> 202 669 5907
> 
> 
>  
> 

-- 
View this message in context: http://www.nabble.com/datacache-tp17326391p17565542.html
Sent from the OpenJPA Developers mailing list archive at Nabble.com.


Mime
View raw message