From: Daniel Dekany
Date: Sun, 17 Apr 2005 01:41:40 +0200
To: Edgar Poce
Reply-To: jackrabbit-dev@incubator.apache.org
Subject: Re: Getting "custom" objects from the repository?
Message-ID: <71217167.20050417014140@freemail.hu>
In-Reply-To: <42618509.4030200@gmail.com>

Saturday, April 16, 2005, 11:35:05 PM, Edgar Poce wrote:

> Hi daniel
>
> Daniel Dekany wrote:
>> I would be happily build a such framework, but I don't see how... JCR
>> nodes doesn't even have some kind of automatically maintained
>> last-modified property that I could use for quickly checking if the
>> object in the cache is outdated or not. It is almost everything that is
>> needed for the happiness. Seems to me such a low hanging fruit...
>>
>>   Node n = (Node) session.getItem("/foo/theTemplate");
>>   cacheEntry = cache.get("/foo/theTemplate");
>>   if (!n.getStamp().equals(cacheEntry.getStamp())) {
>>       The cached object is outdated, so let's recreate using the
>>       current value of the insertsomethinghere property.
>>   } else {
>>       return cacheEntry.getObject()
>>   }
>>
>
> I guess you can do it by creating a custom node type with a mandatory
> property that stores the last update timestamp. WDYT?

Not good. Because:

a) If all modifications are made through my CMS (which handles the
   "mapping"), then I can ensure that mapping:lastUpdate is always
   changed when I change a property of the node. But a repository is not
   only accessed through the interface of the CMS. It's a central
   content repository that is read and written by various tools. So this
   solution works for certain users, but it is against the idea of an
   "enterprise" content repository, as it wouldn't work there.

b) I believe that what I'm talking about will turn out to be a very
   common task, and I will return to this topic later in this mail. Here
   I'm just saying that there should be a correct, standard, easy way of
   doing this.

For example, the specification should introduce nt:monitored (don't
deal with the poor name choice for now...), which would mean that the
node has these properties:

- jcr:uuid
- jcr:modificationCounter, which is automatically created and
  initialized to 0, and whose value is *automatically* incremented
  whenever a property of the node is written.

Yeah, there are a lot of unclear things about this; it was just a quick
starting-point idea. This would be enough to implement the client-side
object cache ("mapping") that I have talked about. If the uuid or
modificationCounter of the node returned for "foo/index" has changed,
then the cached object shouldn't be used. Furthermore, I think that
there should be a method in the JCR API for this check, so
implementations can optimize this (IMO) frequent task.

Something like this could be an optional feature. Then it will turn out
whether customers want the JCR implementors to support this feature or
not.

> the first line of your example would be something like:
> Property p = (Property)
> session.getItem("/foo/theTemplate/mapping:lastUpdate");
>
> But since graffito, lenya and jackrabbit communities are interested in
> such a tool it would be cool to work together, that would be the apache
> way, right? :). AFAIK the proposal discussed in the graffito dev list
> deals with many of the issues you are talking but I didn't see any
> reference to a cache with already mapped objects.

Excuse me for being a smart aleck and telling you what I think about the
whole issue, and about the (apparent?) lack of comprehension of it:

People have been using RDBMS-es for ages, and only later did they build
"object mapping" layers, like Hibernate, over what they already had. Now
with JCR it seems that people think it's OK to say that JCR is analogous
to JDBC, and since Hibernate works relatively well, the same trick
(adding an object mapping layer later) will work in the case of JCR as
well. I believe that's blind thinking.

Look at how and why Hibernate works without the caching problems I'm
crying about here. You run a query and you get bean instances (instead
of the low-level ResultSet). Whenever you run a query, new bean
instances are created. There is no caching (and thus no cache that can
go out of sync with the storage). No caching is needed because creating
a new simple bean and setting its properties consumes no significant
resources compared to what the DB query eats. (And if you worry about
too much garbage, you can use instance pools, because re-setting
property values is easy, so you can reuse instances.)

Yes, this surely will work with JCR as well. The problem is that if we
are talking about content repositories and CMS-es and the Web, then it
will turn out that there is a strong tendency to store big, complex
objects in the storage (usually stored as BLOB-s, that is, binary
properties, or as long string properties), not just simple tables that
can be easily modelled with those lightweight beans. Again, look at
what's stored in a ZODB (Zope)... lots of objects that you definitely
don't want to shamelessly instantiate and then very soon drop, again and
again, like you did with those beans. For example, you store templates
there, and scripts... creating a template or a script object eats many
resources, and not only because these objects are big and complex, but
because creating the instance may involve tasks like parsing the
template/script (written in the Velocity language, Groovy, etc.). The
same goes for other "serious" objects as well, like XML documents (you
will store them in a string or binary property, but you will certainly
have to parse that into a DOM tree before you can actually use it). A
content repository is a bit like the file system on your HDD: you store
many big/complex files there (lots of BLOB-s, in RDBMS wording), not
just a heap of simple tables.
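
To make the cache idea concrete, here is a minimal sketch in Java of a
client-side cache built on the uuid/modificationCounter check proposed
above. It is only an illustration under assumptions: StampedSource,
InMemorySource and StampedCache are hypothetical names, not part of the
JCR API, and the "stamp" is assumed to be the node's uuid combined with
the proposed modification counter. The cache asks for the cheap stamp
first, and loads and parses the content only when the stamp has changed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

/** Hypothetical stand-in for the repository; NOT part of the JCR API. */
interface StampedSource {
    /** Cheap call: returns "uuid:modificationCounter" for the node at path. */
    String stampOf(String path);

    /** Expensive call: reads the primary property (e.g. the template source). */
    String loadContent(String path);
}

/** Hypothetical in-memory source that maintains the counter on every write. */
final class InMemorySource implements StampedSource {
    private final Map<String, String> content = new HashMap<>();
    private final Map<String, String> uuid = new HashMap<>();
    private final Map<String, Long> counter = new HashMap<>();
    private int loads = 0;

    void put(String path, String value) {
        if (!content.containsKey(path)) {
            uuid.put(path, UUID.randomUUID().toString());
            counter.put(path, 0L);               // counter starts at 0
        } else {
            counter.merge(path, 1L, Long::sum);  // incremented on each later write
        }
        content.put(path, value);
    }

    public String stampOf(String path) {
        return uuid.get(path) + ":" + counter.get(path);
    }

    public String loadContent(String path) {
        loads++;
        return content.get(path);
    }

    int loadCount() {
        return loads;
    }
}

/** Caches expensive-to-build objects, keyed by path and validated by stamp. */
final class StampedCache<T> {
    private static final class Entry<V> {
        final String stamp;
        final V object;

        Entry(String stamp, V object) {
            this.stamp = stamp;
            this.object = object;
        }
    }

    private final Map<String, Entry<T>> entries = new HashMap<>();
    private final StampedSource source;
    private final Function<String, T> parser;

    StampedCache(StampedSource source, Function<String, T> parser) {
        this.source = source;
        this.parser = parser;
    }

    T get(String path) {
        // Read the stamp BEFORE the content: if a write slips in between,
        // the entry is stored under the older stamp and is simply rebuilt
        // on the next call.
        String stamp = source.stampOf(path);
        Entry<T> cached = entries.get(path);
        if (cached != null && cached.stamp.equals(stamp)) {
            return cached.object;                // still current, no I/O or parsing
        }
        T fresh = parser.apply(source.loadContent(path));
        entries.put(path, new Entry<>(stamp, fresh));
        return fresh;
    }
}
```

With a real repository, stampOf would have to be something the
repository itself maintains; the in-memory source only imitates that.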

Furthermore, since these objects are relatively big in the storage, just
getting them from the content repository is expensive in itself because
of the I/O needed (and/or because of the memory needed if they are in
the RAM cache). Instead, you should get only the values of jcr:uuid and
jcr:modificationCounter (described earlier), and get the primary
property (typically jcr:content) only if they have changed.

Last but not least, these objects in the repository are usually seldom
changed (consider how often you modify page templates compared to the
records that store the current stocks), so it is just obvious that you
should cache them inside your CMS, or inside whatever is the client of
the content repository. These objects are used frequently (you will run
the page template of a frequently visited page a lot) and modified
seldom.

-- 
Best regards,
 Daniel Dekany