jackrabbit-dev mailing list archives

From Julian Reschke <julian.resc...@gmx.de>
Subject Re: SPI caching, was: [jira] Resolved: (JCR-1361) Lock test assumes that changes in one session are immediately visible in different session
Date Thu, 07 Feb 2008 13:08:13 GMT

Marcel Reutegger wrote:
>> Example - obtaining a directory listing: SPI2JCR currently gets the 
>> NodeInfo for the collection, then gets the ChildInfo iterator, then 
>> for each NodeId of a child fetches that child's NodeInfo.
>>
>> For a collection of N members, this translates to N additional 
>> roundtrips to the store (with WebDAV, PROPFINDs on each child 
>> resource, although a single PROPFIND with Depth 1 would have been 
>> sufficient).
>>
>> It's not clear to me how it would be able to avoid this with the 
>> current SPI interfaces while disallowing SPI to cache.
> 
> see JCR-1011. we just have to commit the patch.

I think I understand batch read, and how JCR2SPI would use that. What I 
don't see is how it helps in this case.

An SPI implementation *could* return ItemInfos for all children when the 
NodeInfo for a collection is fetched, but how would it know that anybody 
wants to see the members?
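
To put numbers on the listing case, here is a minimal sketch in Java. 
`FakeStore`, `ListingDemo`, and all method names are hypothetical 
stand-ins invented for illustration; they are not the Jackrabbit SPI 
interfaces, and the "store" is an in-memory map that only counts 
simulated requests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a remote store; NOT the Jackrabbit SPI.
final class FakeStore {
    private final Map<String, List<String>> children = new LinkedHashMap<>();
    int roundTrips = 0; // incremented once per simulated request

    void addCollection(String path, List<String> memberNames) {
        children.put(path, new ArrayList<>(memberNames));
    }

    // One request returning only the member names of a collection.
    List<String> listChildNames(String path) {
        roundTrips++;
        return new ArrayList<>(children.getOrDefault(path, List.of()));
    }

    // One request per item (comparable to a PROPFIND with Depth 0).
    String fetchInfo(String path) {
        roundTrips++;
        return "info:" + path;
    }

    // One request for the collection and all members (comparable to Depth 1).
    Map<String, String> fetchInfoDepthOne(String path) {
        roundTrips++;
        Map<String, String> result = new LinkedHashMap<>();
        result.put(path, "info:" + path);
        for (String name : children.getOrDefault(path, List.of())) {
            String childPath = path + "/" + name;
            result.put(childPath, "info:" + childPath);
        }
        return result;
    }
}

final class ListingDemo {
    // Pattern described in the thread: collection info, child list,
    // then one fetch per child. Returns the number of round trips used.
    static int perChildListing(FakeStore store, String path) {
        int before = store.roundTrips;
        store.fetchInfo(path);
        for (String name : store.listChildNames(path)) {
            store.fetchInfo(path + "/" + name);
        }
        return store.roundTrips - before;
    }

    // Single Depth-1 style request. Returns the number of round trips used.
    static int depthOneListing(FakeStore store, String path) {
        int before = store.roundTrips;
        store.fetchInfoDepthOne(path);
        return store.roundTrips - before;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.addCollection("/docs", List.of("a.txt", "b.txt", "c.txt"));
        System.out.println("per-child: " + perChildListing(store, "/docs"));
        System.out.println("depth-1:   " + depthOneListing(store, "/docs"));
    }
}
```

With three members, the per-child pattern costs 5 simulated round trips 
(N + 2) and the Depth 1 variant costs 1. The sketch does not answer the 
question above: the store side can only choose the second variant if it 
is told that the members are wanted.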

>> I have the feeling that we're optimizing for the wrong use case here.
>>
>> If we can't make *read* access efficient enough, we're in trouble. And 
>> I really don't want to require every SPI implementation to subscribe 
>> to events from the underlying store, in particular if it's remote 
>> (think HTTP).
> 
> that's why I don't even want to get into this business. but if an 
> implementation wants to cache something it is responsible for 
> maintaining it.

That's a broad statement.

JCR includes "refresh" for good reasons. Are you arguing that it's not 
needed, and a JCR implementation is responsible for that as well?

I think that would be a fundamentally bad idea, because whether cache 
information needs to be fresh depends on what the client does. There's 
no way the JCR or the SPI implementation could know.

If a client does a collection listing, asking for a limited set of 
properties of the members (name, timestamps, MIME type, length), it 
really doesn't care much whether that information is perfectly fresh. 
However, the SPI implementation has no knowledge about the context in 
which the information in the NodeInfo is needed, and thus has no way to 
optimize the operation.
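
As a rough illustration of client-driven freshness, the following Java 
sketch caches reads until the caller explicitly asks for a refresh. 
`RemoteStore`, `RefreshableCache`, and `RefreshDemo` are hypothetical 
names made up for this example; nothing here is taken from the JCR or 
SPI APIs, and it only assumes that the client is the party that knows 
when stale data is unacceptable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical remote store whose content may change behind the cache's back.
final class RemoteStore {
    private final Map<String, String> values = new HashMap<>();
    int roundTrips = 0; // incremented once per simulated read request

    void put(String path, String value) {
        values.put(path, value);
    }

    String read(String path) {
        roundTrips++;
        return values.get(path);
    }
}

// Hypothetical cache: serves cached values until the client calls refresh().
final class RefreshableCache {
    private final RemoteStore store;
    private final Map<String, String> cache = new HashMap<>();

    RefreshableCache(RemoteStore store) {
        this.store = store;
    }

    String get(String path) {
        // Only a cache miss reaches the store.
        return cache.computeIfAbsent(path, store::read);
    }

    // The client signals that it needs up-to-date information.
    void refresh() {
        cache.clear();
    }
}

final class RefreshDemo {
    public static void main(String[] args) {
        RemoteStore store = new RemoteStore();
        store.put("/docs/a.txt", "v1");
        RefreshableCache session = new RefreshableCache(store);

        System.out.println(session.get("/docs/a.txt")); // v1, one round trip
        store.put("/docs/a.txt", "v2");                 // change made elsewhere
        System.out.println(session.get("/docs/a.txt")); // v1, served from cache
        session.refresh();
        System.out.println(session.get("/docs/a.txt")); // v2, second round trip
    }
}
```

A listing client would simply never call `refresh()` here, while a 
client that needs current data pays for the extra round trip only when 
it asks for it.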

>> JCR clients today can not rely on fresh session information unless 
>> they do a refresh(), and it's unclear to me why we would require that 
>> from an SPI implementation.
> 
> it is a fundamental requirement that the SPI implementation provides the 
> most up-to-date item that is available. the refresh semantic is only 
> relevant in the context of jcr2spi but not the SPI itself.

Where does this requirement come from? Is it stated somewhere? Have you 
ever tried to compare performance between native Jackrabbit and an 
SPI-based solution for operations like the one mentioned above?

>> [...] or just discard the SessionInfo and get a fresh one.
> 
> that's contrary to how the SessionInfo is designed. It is meant to be 
> the result of a successful authentication. If it holds state information 
> that is relevant to the server (e.g. a cache, a JCR session, JDBC 
> connection, ...) it is the responsibility of the implementation to 
> maintain it. An SPI client does not need nor use that information directly.

I didn't claim it does.

> Again any call using a SessionInfo should return the most up-to-date 
> item(s) that are requested.

Requiring this sounds nice in theory, but I'm *very* skeptical that it 
works in practice.

>  > If the JCR client does call "refresh()", we really should pass that
>  > information to SPI, either by a new method (which could be more
>  > elaborate than just refresh() as mentioned by Angela), or [...]
> 
> That's IMO a more relevant use case that we should consider rather than 
> caching.

I'm not sure how this is a different use case, but I don't really care 
about the motivation.

At the end of the day, what we should do is *measure* the performance of 
JCR2SPI compared to native implementations. I'll try to submit a few 
tests soon.

BR, Julian
