jackrabbit-users mailing list archives

From Alessandro Bologna <alessandro.bolo...@gmail.com>
Subject Re: Importing and Exporting XML
Date Wed, 13 Jun 2007 11:08:23 GMT

I agree with what Dan said. Another way to customize your import
and export is to use (and extend as needed) a couple of classes
found in the contrib section of the Jackrabbit SVN repository.

In particular, you may want to look at the DocumentViewExportVisitor  
and DocumentViewImportVisitor. You need to understand how SAX parsing  
works a bit, but it should be easy enough to customize them to select  
what attributes you want to see in your DocumentView export, and how  
to map your XML into Jackrabbit nodes and properties.
Incidentally, those classes support multivalued properties,
represented as space-separated attribute values.
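
To give an idea of the SAX side, here is a minimal sketch of splitting
space-separated attribute values into multiple values. It uses only the
JDK and is not the contrib visitor code; the class name
AttributeSplitDemo and the "path/@attr" key format are made up for
illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Jackrabbit: shows the SAX pattern only.
class AttributeSplitDemo {

    // Returns a map from "/element/path/@attribute" to the list of values
    // obtained by splitting the attribute text on whitespace.
    static Map<String, List<String>> parse(String xml) throws Exception {
        final Map<String, List<String>> props = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            // Stack of element names from the root down to the current element.
            private final ArrayDeque<String> path = new ArrayDeque<>();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                path.addLast(qName);
                String nodePath = "/" + String.join("/", path);
                for (int i = 0; i < atts.getLength(); i++) {
                    // One attribute, possibly several values.
                    String[] values = atts.getValue(i).trim().split("\\s+");
                    props.put(nodePath + "/@" + atts.getQName(i),
                              Arrays.asList(values));
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                path.removeLast();
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return props;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<foo><bar tags=\"a b c\" title=\"x\"/></foo>"));
    }
}
```

Note that this sketch splits every attribute, so a single-valued
string containing spaces would be split too. A real importer would
presumably check the property definition before deciding to split.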

About the SystemView: it is not really meant to be human readable,
but rather machine readable (hence the name SystemView). It can also
be a bit misleading, because it breaks the node/element mapping that
JCR supports (which the JSR-170 spec defines as the VirtualDocument).
In other words, the XPath expression "/foo/bar" becomes, in system
view, something like
/sv:node[@sv:name='foo']/sv:node[@sv:name='bar']/sv:value
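
To make the difference concrete, here is a small sketch that runs
XPath against two hand-written fragments, one shaped like a document
view export and one shaped like a system view export. The fragments,
the "title" property and the class name ViewXPathDemo are made up for
illustration; only the sv namespace URI comes from JSR-170. In the
system view serialization the values sit inside sv:property elements,
so the full path has one more step than the rough one above.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML below is hand-written, not real export output.
class ViewXPathDemo {
    static final String SV_NS = "http://www.jcp.org/jcr/sv/1.0";

    // Document view: nodes are elements, properties are attributes.
    static final String DOCUMENT_VIEW = "<foo><bar title=\"hello\"/></foo>";

    // System view: every node is an sv:node, every property an sv:property.
    static final String SYSTEM_VIEW =
          "<sv:node xmlns:sv=\"" + SV_NS + "\" sv:name=\"foo\">"
        + "<sv:node sv:name=\"bar\">"
        + "<sv:property sv:name=\"title\" sv:type=\"String\">"
        + "<sv:value>hello</sv:value>"
        + "</sv:property>"
        + "</sv:node>"
        + "</sv:node>";

    static final String DOCUMENT_VIEW_PATH = "/foo/bar/@title";
    static final String SYSTEM_VIEW_PATH =
          "/sv:node[@sv:name='foo']/sv:node[@sv:name='bar']"
        + "/sv:property[@sv:name='title']/sv:value";

    // Evaluates an XPath expression and returns its string value.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the "sv" prefix used in the expressions to the sv namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "sv".equals(prefix) ? SV_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return SV_NS.equals(uri) ? "sv" : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                return null;
            }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(evaluate(DOCUMENT_VIEW, DOCUMENT_VIEW_PATH));
        System.out.println(evaluate(SYSTEM_VIEW, SYSTEM_VIEW_PATH));
    }
}
```

The JCR-style path only works against the document view shape; the
same path run against the system view fragment selects nothing.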

In my current project, I have implemented a REST-style interface to
the JCR, so that the XSLT document() function can be used to access
JCR subtrees. It is very important there to preserve the mapping
between the XPath in XML fragments (constructed using the
DocumentViewExportVisitor) and the XPath in the JCR, so the
SystemView does not really apply.

Hope it helps.

On Jun 13, 2007, at 6:16 AM, Dan Connelly wrote:

> woolly:
> The term "DocumentView" is slightly misleading.    It's more like a  
> Shredded And Annotated Document View.
> The xml document will get shredded into its constituent element  
> nodes when you import it as "DocumentView".   This import will not  
> store a single, coherent document in the Repository.    WebDav  
> support in Jackrabbit, on the other hand, can be used to store the  
> document as coherent text.     Customized, hybrid approaches are  
> possible to support structured content (partial shredding over  
> WebDav).  It depends on your use case how much (or how little)  
> shredding you want.
> The metadata gets added to raw shreds during DocumentView import to  
> indicate the Jackrabbit element node type structure.    By default,  
> node type will be nt:unstructured on raw nodes (not having metadata  
> already).   You can write a simple XSLT to strip out the metadata  
> when you export.   For import you can work this in reverse and add  
> a custom structure using XSLT (but that may not be simple).
> It sounds like your use case (customized node editing) requires  
> some custom node types.   This can work nicely if the set of  
> element tags is limited and fixed.   Also, you will probably  
> need to add some custom xml processing (dom, sax or xslt).
> What xml editor are you using?   I think XML Spy has integration  
> features that would support partial shredding and customized  
> document views.   (But, I have never worked this.)
>    -- Dan Connelly
> woolly wrote:
>> Hi all,
>> Is it possible to import xml into a node, and then export that xml  
>> back out
>> to have the same xml-equivalent file? At the moment I'm trying:
>> fis = new FileInputStream(inputFile);
>> session.importXML(node.getPath(), fis,
>> fis.close();
>> // followed by....
>> out = new FileOutputStream(outputFile);
>> session.exportDocumentView(node.getPath(), out, true, false);
>> The difference between inputFile and outputFile seems to be that  
>> there are
>> some additional jcr specific attributes. Is this necessary?
>> What I'm really trying to do is manage an xml document (eventually  
>> many xml
>> documents), allow people to make changes to only certain parts of it,
>> versioning those parts and using other JackRabbit features. Is  
>> this the kind
>> of thing that JackRabbit was intended for? Or should I just load  
>> the xml
>> document in as a property of a node and deal with the other things  
>> myself?
>> Thanks for any help,
>> Phil.

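For the stripping step Dan mentions in the quoted message, a minimal
sketch of an XSLT that drops the jcr: attributes from a document view
export could look like the following. It runs through the JDK's
built-in javax.xml.transform support. The class name StripJcrMetadata
and the sample input are made up for illustration, and I am assuming
the only metadata you want to remove is attributes in the jcr
namespace (jcr:primaryType, jcr:mixinTypes, jcr:uuid and so on).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Jackrabbit.
class StripJcrMetadata {

    // Identity-style transform with one extra rule: attributes in the
    // jcr namespace match an empty template and are therefore dropped.
    // Elements are rebuilt with xsl:element (rather than xsl:copy) so the
    // unused xmlns:jcr declaration is not carried over from the input.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:jcr='http://www.jcp.org/jcr/1.0'"
        + " exclude-result-prefixes='jcr'>"
        + "<xsl:template match='*'>"
        + "<xsl:element name='{name()}' namespace='{namespace-uri()}'>"
        + "<xsl:apply-templates select='@*|node()'/>"
        + "</xsl:element>"
        + "</xsl:template>"
        + "<xsl:template match='@*|text()|comment()|processing-instruction()'>"
        + "<xsl:copy/>"
        + "</xsl:template>"
        + "<xsl:template match='@jcr:*'/>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the result as text.
    static String strip(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String exported =
              "<foo xmlns:jcr=\"http://www.jcp.org/jcr/1.0\""
            + " jcr:primaryType=\"nt:unstructured\">"
            + "<bar jcr:primaryType=\"nt:unstructured\" title=\"hello\">text</bar>"
            + "</foo>";
        System.out.println(strip(exported));
    }
}
```

In practice the input would be the stream written by
session.exportDocumentView rather than a string literal. Going the
other way (adding node type metadata before import) is the harder
direction, as Dan notes.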