jackrabbit-dev mailing list archives

From Andreas Hartmann <andr...@apache.org>
Subject Re: JCR & thesis
Date Tue, 01 Apr 2008 08:02:45 GMT
Hi Bertrand,

Bertrand Delacretaz wrote:
> On Mon, Mar 31, 2008 at 10:49 PM, Andreas Hartmann <andreas@apache.org> wrote:
> 
>> ... Find all documents containing the XPath
>>  //a[local-name() = 'xhtml' and namespace-uri() = 'http://...' and
>>  starts-with(@href,'lenya-document:c2c38f30-ff68-11dc-9682-9dea3e2477d4')]
>>  That would be typical to find links that would be broken after a
>>  document is removed from the live site. I know that JCR doesn't support
>>  this directly - I guess this is where XML DBs shine. With JCR, is it
>>  necessary to traverse all documents and query the content using XPath,
>>  or is there a better solution?...
> 
> That's a typical case where the content model makes all the
> difference: if each link is a JCR Item (a soft or hard reference
> property for example), instead of being embedded in the content,
> finding them is very efficient.
> 
> That might require some processing when saving documents, with the
> benefit of a much richer content structure.

Just to check my understanding: before saving, I would parse the document,
extract all internal links, and add them to an "outgoingLinks" multi-value
property? That makes a lot of sense. We could even add this feature to
our current Lenya repository (we have multi-value metadata). Thanks for
the hint!
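
To make the idea concrete, here is a rough sketch of the save-time step in
Python. It is only an illustration under several assumptions: the function
names are made up, the links are assumed to be XHTML "a" elements in the
standard XHTML namespace, and a plain dict stands in for the repository. In
JCR the extracted list would be stored as the multi-value "outgoingLinks"
property and found with a property query instead of the loop shown here.

```python
import xml.etree.ElementTree as ET

# Assumption: links are XHTML <a> elements in the standard XHTML namespace.
XHTML_NS = "http://www.w3.org/1999/xhtml"
# Prefix of internal links, taken from the example query in this thread.
LINK_PREFIX = "lenya-document:"


def extract_outgoing_links(xhtml_source):
    """Return the distinct internal link targets of a document, in document order."""
    root = ET.fromstring(xhtml_source)
    links = []
    for anchor in root.iter("{%s}a" % XHTML_NS):
        href = anchor.get("href", "")
        if href.startswith(LINK_PREFIX) and href not in links:
            links.append(href)
    return links


def find_referrers(link_index, target):
    """Return the paths of all documents with a link starting with `target`.

    `link_index` maps a document path to its stored outgoing links; it is a
    hypothetical stand-in for the multi-value property kept on each document.
    """
    return sorted(
        path
        for path, links in link_index.items()
        if any(link.startswith(target) for link in links)
    )
```

Before taking a document offline, one would call find_referrers with its
"lenya-document:<uuid>" URI and get the documents whose links would break,
without parsing any content at query time.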

-- Andreas

> 
> Such an example shows how hard it is to compare storage technologies,
> and how important it is to publish the complete source code used for
> tests, so that experts of each technology can have a look and comment
> on what could be improved.
> 
> -Bertrand
> 


-- 
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
Tel.: +41 (0) 43 818 57 01

