From: Julian Reschke
Date: Sat, 23 Feb 2008 11:41:52 +0100
To: dev@jackrabbit.apache.org
Subject: Re: [jira] Commented: (JCR-1405) SPI: Introduce NodeInfo.getChildInfos()
Message-ID: <47BFF870.7040804@gmx.de>
References: <1984270253.1203760159309.JavaMail.jira@brutus>
In-Reply-To: <1984270253.1203760159309.JavaMail.jira@brutus>

...continuing on the mailing list; I think this exceeds what an issue tracker is good for.

angela (JIRA) wrote:
> angela commented on JCR-1405:
> -----------------------------
>
> julian:
>
> if you cant determine the childinfos upon creating the nodeinfo you should (as stated by the javadoc) simply return null,

Trying to determine the child infos in practice means asking the underlying storage for them. If there are 1000 children, and I get an internal error for the last, I would then have to return null. Which means that JCR2SPI asks again, using RepositoryService. Not good.

The only case where this new method would actually help is where the set of child node names is known in advance, such as for nodes of type nt:file. It's nice to be able to optimize those, but not sufficient.

We started the discussion because of the horrific performance of JCR2SPI for large collections (where it currently reaches something around 2% of what my persistence layer can do). Are we still trying to solve this?

> if you cant build the nodeinfo due to some exceptional situation you should throw upon getNodeInfo or getItemInfos
> respectively.
>
> the exception with repositoryservice getChildInfo means the same as the one defined with getNodeInfo or getItemInfos:
> - the target node does not exist (any more) in the persistent state
> - the persistent layer cant be accessed or something similar.

Well. If the *construction* of the NodeInfo now requires deciding whether to return child infos or not, then this change doesn't help, because it doesn't scale for large collections. I'm not going to retrieve child information unless somebody asks for it -- and that is when NodeInfo.getChildInfos is called, not when the NodeInfo is constructed.
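
To illustrate the deferred retrieval described above, here is a minimal Java sketch. It is an assumption-laden illustration, not the Jackrabbit SPI: `LazyNodeInfo`, `ChildStore`, `CountingStore` and `StoreException` are hypothetical names, child infos are reduced to plain strings, and the result is cached after the first request. The point it shows is only that constructing the info object touches no storage, and that a storage failure would surface when the children are requested.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyChildInfosSketch {

    /** Hypothetical stand-in for a storage failure; not the JCR RepositoryException. */
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    /** Hypothetical persistence layer that lists child names on request. */
    interface ChildStore {
        List<String> listChildNames(String parentPath) throws StoreException;
    }

    /** Hypothetical node info: cheap to construct, loads children on demand. */
    static class LazyNodeInfo {
        private final String path;
        private final ChildStore store;
        private List<String> children; // stays null until first requested

        LazyNodeInfo(String path, ChildStore store) {
            // No storage access here: construction cost is independent
            // of the number of children.
            this.path = path;
            this.store = store;
        }

        /** Asks the store only on the first call; a failure is reported here. */
        List<String> getChildInfos() throws StoreException {
            if (children == null) {
                children = store.listChildNames(path);
            }
            return children;
        }
    }

    /** In-memory store that counts how often it is asked for children. */
    static class CountingStore implements ChildStore {
        int calls = 0;

        @Override
        public List<String> listChildNames(String parentPath) {
            calls++;
            List<String> names = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                names.add(parentPath + "/child" + i);
            }
            return names;
        }
    }

    public static void main(String[] args) throws StoreException {
        CountingStore store = new CountingStore();
        LazyNodeInfo info = new LazyNodeInfo("/large", store);
        System.out.println("store calls after construction: " + store.calls);
        System.out.println("children: " + info.getChildInfos().size());
        info.getChildInfos();
        System.out.println("store calls after two requests: " + store.calls);
    }
}
```

Whether caching the list is appropriate depends on the store; for very large collections a streaming result may be preferable, which this sketch does not attempt.
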
> therefore i am with marcels explanation how nodeinfo should be created and work.
>
> in addition, if you decide to do some lazy loading of the childinfos upon NodeInfo.getChildInfos (or upon RepositoryService.getChildInfos) the exception from my point of view is not raised upon building the iterator but upon retrieving the next element.... and there you wont be able to throw repository exception either.

...which may be an indication that a generic Iterator is not the right thing to use either.

> regarding "large":
> this is just one obvious example what could be a reason for the implementation NOT to reveal
> the child infos upon NodeInfo.getChildInfos. and the description mentions this as example.

Again; I started this discussion because of the performance for large collections. You seem to be trying to solve an entirely different problem -- do we have any data that indicates that it's worth solving? How exactly is it better than batch read?

> that it states: if the impl is not willing.
>
> Not willing means that the SPI implementations decides upon internal rules whether the
> childinfos are included or not. examples: the impl. decides
>
> - based on the internal structure of the persistent layer in general
> - based the cost of retrieving childinfos (given the potential chance of never being asked for)

See -- that's the problem. It seems to me what we really need is a way to indicate that the children *will* be needed.

> - based on the known characteristics of the target node: e.g. we have folder and files and other nodes
> and we assume that folders will be used for displaying the children so send it. for any other nodes we dont

See above -- doesn't work in practice.

> - based on the simple amount of child nodes if we know that (dont calc if more than 14)
> - based on a implementation specific configuration
> that could include nodetypes, number of child nodes, day time, session.userId, random... whatever
> you feel would be appropriate, reasonable or simply a good thing for your specific store.
>
> the last is pretty much what we discussed for the getItemInfos method for the batch read. we said
> that we cant add a config to the spi interfaces and want to leave that to the impl because we would
> not be able to find something that fits the needs for all potential implementations.

I do agree that the SPI impl needs to decide on things like that. But we have to give it sufficient information.

> if your store cant retrieve the child info you may
> - create your reposervice with a config and leave the decision to someone else
> - always calculate the child infos

Again, that doesn't work for the use case we're trying to solve. Or at least the one I thought we're trying to solve.

> - never calculate the child infos
> - decide based on the characteristics of the requested node
> -...
> (see above)
>
> so. i am not in favor of adding exceptions to the new method... at least not for the reasons presented so far.
> angela

I'm in favor of first clearly stating what we're trying to do; then creating tests for obtaining measurements; and then re-discussing what needs to be done.

BR, Julian