openjpa-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Linskey <plins...@gmail.com>
Subject Re: datacache
Date Thu, 29 May 2008 01:43:19 GMT
>> But, Patrick, you lead a wider discussion on the implicit design
>> assumption: "the data cache is always correct with respect to the  
>> database".
>> Going forward, with pessimistic transaction being part of the spec,  
>> this
>> view of DataCache being the 'in-memory replica of database' will be  
>> harder
>> to comply and perhaps requires a revisit.

Bear in mind that OpenJPA bypasses the cache when using pessimistic  
locking.

>> 2. Inconsistent in-memory model:

As I've mentioned before, I believe that any such inconsistencies  
should be treated as bugs, not design features.

>> Can we analyze the data in DataCache to detect such data that are  
>> prone to
>> inconsistencies and hence leave them from being cached?  perhaps yes.

We can definitely detect such issues.

>> Does it worth it? no, because assuming that data is mostly  
>> consistent, it
>> will penalize the consistent model to save some careless programmer  
>> who
>> could not keep their model consistent.  Also InverseManager already  
>> can do
>> that at L1 cache level, for some inconsistencies.

I do not believe that w can deem up-front and in general that such  
inconsistencies are not worth resolving. I believe that we must look  
at these bugs individually when determining how to resolve things.

>> 3. retiring the 'optimization' on refresh
>> there were two refreshInternal() methods, one with a single  
>> instance and
>> other with a collection. They had very similar code except a  
>> boolean flag
>> that impacted behavior of a single instance. The collection form used
>> loadAll() for better performance and I kept that. Maintaining very  
>> similar
>> (but not the same) code for a single instance versus a collection of
>> instance is risky.

What measurements, if any, did you perform along with the change? What  
is the performance impact of calling the plural method in the single- 
instance case?

-Patrick

On May 29, 2008, at 8:42 AM, Craig L Russell wrote:

>
> On May 28, 2008, at 10:02 AM, Pinaki Poddar wrote:
>
>>
>> Hi Patrick,
>>> Historically, we've assumed that for the purposes of refresh(), the
>>> data cache is always correct with respect to the database.
>>
>> Yes, I understand the import/impact of my suggestion to bypass  
>> DataCache
>> altogether for refresh(). And hence I value what you are saying in  
>> this
>> regard. My suggestion is leaned towards spec-compliance (or rather my
>> assumption of what the mythical 'user' wants as refresh()  
>> behavior). The
>> spec says for refresh() as in JavaDoc: "Refresh the state of the  
>> instance
>> from the database, overwriting changes made to the entity, if any".  
>> I read
>> it as: "the user wants refresh() to hit the database". You can  
>> always refute
>> that assumption because of the core axiom:" Nobody really knows  
>> what does
>> the user want".
>
> The issue here is that the data cache is not a specified concept. So  
> we are really on our own here.
>
> The concept of refresh is in JPA, but the concept of evict is not.  
> So I think that refresh should work as advertised with or without  
> the data cache. Which means that data should be retrieved from the  
> database (not the cache).
>
> But clearly, some applications will operate in an environment where  
> the data cache is always kept current, by the global cache manager.  
> In these cases, it is sufficient to refresh from the cache and not  
> go to the database. For this situation, I'd prefer that the user set  
> a flag in the EntityManagerFactory that says RefreshFromCache if the  
> global cache manager can guarantee that the cache is kept in sync  
> with the database.
>
> And I'd go farther, by asking that this concept be put into the JPA  
> spec explicitly, if the expert group adopts the cache API.
>
> Craig
>>
>>
>> The other relevant bit is the flag FORCE_LOAD_REFRESH in  
>> StoreManager. The
>> flag was there and Broker.refreshInternal() used it to call the
>> StoreManager.loadAll(). That is the only place the flag had been  
>> used.
>> However, DataCacheStoreManager ignored it. In fact, I used it now  
>> to bypass
>> DataCache for refresh(). May be some one can comment on the  
>> original intent
>> behind that flag?
>>
>> But, Patrick, you lead a wider discussion on the implicit design
>> assumption: "the data cache is always correct with respect to the  
>> database".
>> Going forward, with pessimistic transaction being part of the spec,  
>> this
>> view of DataCache being the 'in-memory replica of database' will be  
>> harder
>> to comply and perhaps requires a revisit.
>>
>> 2. Inconsistent in-memory model:
>> The other issue being discussed in this thread i.e. "inconsistent
>> in-memory data model" also tends to show limitation of the view that
>> DataCache is a replica of database albeit from a representational  
>> form.
>> Database represents relation/data differently than DataCache does.
>> Inconsistent relations, under some mapping, automagically becomes  
>> consistent
>> in the database because while Java (and DataCache) represents a
>> bi-directional relation with a pair of variables (and hence leaving  
>> room for
>> inconsistency), database has a single foreign key column and hence  
>> no scope
>> for ambiguity.
>> Can we analyze the data in DataCache to detect such data that are  
>> prone to
>> inconsistencies and hence leave them from being cached?  perhaps yes.
>> Does it worth it? no, because assuming that data is mostly  
>> consistent, it
>> will penalize the consistent model to save some careless programmer  
>> who
>> could not keep their model consistent.  Also InverseManager already  
>> can do
>> that at L1 cache level, for some inconsistencies.
>>
>>
>> 3. retiring the 'optimization' on refresh
>> there were two refreshInternal() methods, one with a single  
>> instance and
>> other with a collection. They had very similar code except a  
>> boolean flag
>> that impacted behavior of a single instance. The collection form used
>> loadAll() for better performance and I kept that. Maintaining very  
>> similar
>> (but not the same) code for a single instance versus a collection of
>> instance is risky.
>>
>>
>>
>>
>> Patrick Linskey-2 wrote:
>>>
>>>> b) bypasses Data cache for refresh() altogether
>>>
>>> Historically, we've assumed that for the purposes of refresh(), the
>>> data cache is always correct with respect to the database. I.e., the
>>> assumption has been that if changes were made out-of-band, then the
>>> developer would call DataCache.evict() prior to refresh(). It would
>>> seem that your suggestion significantly changes this assumption.
>>>
>>> For the case of Broker.evict() calls, the Broker has a special  
>>> setting
>>> (setEvictFromDataCache()) that allows configuration of whether or  
>>> not
>>> Broker.evict() also calls DataCache.evict(). We should consider this
>>> precedent and the prior assumptions when coming up with a strategy  
>>> for
>>> changing the semantics of refresh().
>>>
>>>> c) throws away 'optimization' for a single entity
>>>
>>> Can you describe this in more detail? Optimizations are generally
>>> there for a reason -- getting rid of it might very well lead to sub-
>>> optimal behavior.
>>>
>>> -Patrick
>>>
>>> On May 28, 2008, at 9:38 AM, Pinaki Poddar wrote:
>>>
>>>>
>>>> Hi Craig,
>>>> Committed revision 660753 towards a model that
>>>> a) corrects refresh() behavior of single, clean entity
>>>> b) bypasses Data cache for refresh() altogether
>>>> c) throws away 'optimization' for a single entity
>>>>
>>>> Pinaki
>>>>
>>>>
>>>> Craig L Russell wrote:
>>>>>
>>>>> Hi Pinaki,
>>>>>
>>>>> On May 27, 2008, at 3:00 PM, Pinaki Poddar wrote:
>>>>>
>>>>>>
>>>>>> After some more analysis of refresh() issue...
>>>>>>
>>>>>> 1. it is observed that a refresh of a single, clean instance  
>>>>>> never
>>>>>> hits the
>>>>>> database -- irrespective of whether Data Cache is active or not.
>>>>>> That does
>>>>>> not appear spec compliant.
>>>>>
>>>>> I agree.
>>>>>>
>>>>>>
>>>>>> 2. refresh() behaves differently on current lock level. With NO 

>>>>>> LOCK
>>>>>> it
>>>>>> reads from Data Cache; on any stronger lock it hits the database.
>>>>>> I am of the opinion that all refresh() must bypass data cache
>>>>>> altogether
>>>>>> always -- because refresh() seems to express explicit intent of 

>>>>>> the
>>>>>> user to
>>>>>> read data from database (say when the application thinks that  
>>>>>> out-
>>>>>> of-
>>>>>> band
>>>>>> modifications may have taken place on the database).
>>>>>
>>>>> I agree.
>>>>>>
>>>>>>
>>>>>> 3. There is an 'optimization' on BrokerImpl.refresh() -- one  
>>>>>> for a
>>>>>> single
>>>>>> entity and other for a collection. Removing that optimization  
>>>>>> (which
>>>>>> leaves
>>>>>> some maintenance concern of similar but not same code blocks) is
>>>>>> another
>>>>>> suggestion.
>>>>>
>>>>> Seems that if the user calls refresh on a single entity or on a
>>>>> collection, then we should hit the database every time. Who are  
>>>>> we to
>>>>> know that the database hasn't changed in the last millisecond?  
>>>>> Sure,
>>>>> we're smart, but we're not omniscient.
>>>>>
>>>>> Craig
>>>>>>
>>>>>>
>>>>>>
>>>>>> Comments?
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/datacache-tp17326391p17501042.html
>>>>>> Sent from the OpenJPA Developers mailing list archive at  
>>>>>> Nabble.com.
>>>>>>
>>>>>
>>>>> Craig Russell
>>>>> Architect, Sun Java Enterprise System http://java.sun.com/ 
>>>>> products/
>>>>> jdo
>>>>> 408 276-5638 mailto:Craig.Russell@sun.com
>>>>> P.S. A good JDO? O, Gasp!
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> -- 
>>>> View this message in context:
>>>> http://www.nabble.com/datacache-tp17326391p17502434.html
>>>> Sent from the OpenJPA Developers mailing list archive at  
>>>> Nabble.com.
>>>>
>>>
>>> -- 
>>> Patrick Linskey
>>> 202 669 5907
>>>
>>>
>>>
>>>
>>
>> -- 
>> View this message in context: http://www.nabble.com/datacache-tp17326391p17517403.html
>> Sent from the OpenJPA Developers mailing list archive at Nabble.com.
>>
>
> Craig Russell
> Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
> 408 276-5638 mailto:Craig.Russell@sun.com
> P.S. A good JDO? O, Gasp!
>

-- 
Patrick Linskey
202 669 5907


Mime
View raw message