jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Nicolas Toper (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-569) WorkspaceImporter Refactoring
Date Wed, 13 Sep 2006 17:41:24 GMT
    [ http://issues.apache.org/jira/browse/JCR-569?page=comments#action_12434496 ] 
            
Nicolas Toper commented on JCR-569:
-----------------------------------

Sure. Here it is:

Introduction
The current o.a.j.core.xml.WorspaceImporter class has one main responsability: to import data
provided by a ContentHandler to the repository. It needs to check imported data to maintain
consistancy and coherence of the repository and in some cases rejects them.

The calling stack is:
- WorkspaceImporter implements Importer
- SysViewImportHandler
- SaxParser (and various classes associated)
- Other classes

The Importer interface is composed of 4 methods: start, end, startNode, endNode. The current
WorkspaceImporter contains only two protected classe

Currently the code is complex to grasp and have some issues I would like to solve with this
iteration of the backup tool (I am using this class heavily).

Issues I am trying to solve:
- Methods are too long: there are 7 methods in this class for 619 lines. In average one method
per 88 lines. This goes with the 4 levels of if/else. This code is hard to debug.

- Complexity: there is no clear separation of concerns between methods. This is partly due
to the Importer interface quite close to the XML. For instance, startNode() checks the validity
of the import ad endNode do. We need to create more methods giving a role to each one.

- Factorizing/Code duplicate: Right now we check several times the same thing and sometimes
not at all. For instance we check the sanity of the workspace at the end of each node and
at the end of the import. We also check several times if a node can be added without disrupting
the consistancy of the repository. We should have a private dedicated method for that.

- Bugs: jcr:root is not handled correctly and other bugs have been detected. Since the code
is hardly readable,

-  Lack of flexibility: I will give just an example. Because of this monolothic design (only
7 methods), the checks are embedded in each method (cf. upper). I had to subclass the 4 methods
of the Importer interface. Also the WorkspaceImporter initiates versioning if needed and remap
some UUID all the time.

Solution
My proposal is in four parts:

- Add some private methods to abstract some of the complex issue this class is managing

- Add more control in the constructor (for instance for our backup tool): add a constructor
with the BatchedItemOperation so we can choose if needed the needed ItemStateManager. (For
instance to restore the version history).

- Add a raw flag in the constructor to import the content as it is without initializing versioning
and escaping some checks. This is a use case already encountered

- With this new design, solve remaining bugs and handle the jcr:root issue.

This new class will be subclassed by WorkspaceImporter to handle the workspace (I am not using
it for now) and for the sanityCheck(): therefore it will be only a few lines and we will have
good SoC.

Please have a look as the patch proposed earlier to have a look at the design and the new
code (it is far from over and a lot of methods are incomplete).

BR
Nicolas

> WorkspaceImporter Refactoring
> -----------------------------
>
>                 Key: JCR-569
>                 URL: http://issues.apache.org/jira/browse/JCR-569
>             Project: Jackrabbit
>          Issue Type: Improvement
>            Reporter: Nicolas Toper
>         Attachments: SysViewImporter.patch
>
>
> Hi,
> As you know, I have run into an issue with the backup tool using the WorkspaceImporter.
I ended up copy/pasting large body of code since the current WorkspaceImporter was not flexible
enough for my use (in my class called SysViewImporter). This solution was perfectly valid
for Google SoC (a lot of time constraints) but unacceptable in the long run for any project
(we hate large body of duplicate code :p).
> Also, some issues have been raised with this class (i.e. jcr:root importation, allowsSameNameSiblings
issue). 
> Overall I feel this class  is circumvoluted and really hard to understand. For instance,
the current code contains at most 5 imbricated if and uses a lot of different ways to pass
information (stacks, objects set on null). 
> I tried to refactor my SysViewImporter and the WorkspaceImporter but it implies a new
code for the WorkspaceImporter and the SysViewImporter. Here is its skeleton! I first wanted
to gather the community feedback before stepping in. I tried moving all "work code" away from
the startNode method and reorganise it for readilibility.
> Please give me your feedback.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message