chemistry-dev mailing list archives

From Bogdan Stefanescu <bstefane...@nuxeo.com>
Subject Re: Java Client/Adbera (was: Re: CMIS Client API)
Date Fri, 05 Jun 2009 09:10:35 GMT

Hi all,

Let me add more detail on why the existing AtomPub client
is not using Abdera.

As Florent said, the existing client was written in a hurry for a
customer and is not yet aligned with the latest Chemistry API.
The objective was to build a very responsive rich client (based on the
Eclipse RCP framework) to be used by people writing news to
view/edit/publish stories.
Here are three requirements of the application (to give an idea of
its constraints, e.g. performance / memory / responsiveness):
- People should be able to fetch raw news text (using the atom client),
create / edit news and publish it as fast as possible (a few
seconds to create and publish a news item)
- The application should display many views on remote feeds (loaded
using the atom client)
- Feed views must be refreshed every 2 seconds.

Initially I started with Abdera, but for several reasons (explained
below) I decided it was not appropriate for the type of
application I wanted to build.
So, let's see how we can use Abdera to build a Chemistry object model
implementation for an atom client. We have 2 choices:
1. Either wrap Abdera objects in your own Chemistry objects
2. Or use Abdera to parse the feed and build Chemistry
objects that are totally detached from Abdera objects.

Let's discuss both approaches. I will start with the worse one.

1. Wrap Abdera objects in Chemistry objects
This was my first approach. Here are the pros/cons:
Pros:
- Simplifies the implementation of the Chemistry model a bit, atom
validation included. No need to write StAX (or SAX) code to read your
objects from the remote feed.

   In fact the simplification brought by Abdera is relative. You still
need to write code to parse your CMIS objects from the Abdera DOM.
   Feed and entry parsing are not complicated anyway (atom is a nice
and simple format).
   So the only things Abdera really provides are atom validation and
an atom-aware XML DOM. The rest has to be implemented anyway (like the
CMIS object parsing from the Abdera DOM).
If you don't need atom validation, then using another XML DOM library
is the same from the Chemistry code's perspective: you still need to
parse the CMIS objects.
Here is a link to benchmarks on several XML DOM parsers including  
AXIOM (the one used by Abdera):
http://www.xml.com/lpt/a/1703


Cons:
- Adds many extra dependencies to your application (if I remember
correctly, 3 or 4 Abdera JARs + 2 AXIOM JARs)
- Your CMIS objects will be larger (they embed Abdera objects, which
contain additional data not used by the CMIS code).
- Debugging is difficult.
When you introspect CMIS objects (that wrap Abdera objects) you need
to introspect Abdera objects, which are based on the AXIOM model, a
lazy DOM model (it reads XML data into the DOM objects only when
required). To understand what your object contains you need to
understand the AXIOM model.
- A technical issue I had with the AXIOM way of doing things.
I will describe it here:
As mentioned above, AXIOM loads data from XML into the DOM only
on client demand. For example, if the client doesn't need to access the
30th entry in the feed, the data of that entry will not be read from
the XML input stream.
This is a very interesting AXIOM feature that I like, but it
has a side effect in the case of my application.
Because AXIOM reads the input stream only on demand, it requires
the input stream to stay open until you have read all the data you
want from it.
This means that if you close the stream before the UI is completely
updated you will get an exception like this one:


Exception in thread "main" org.apache.abdera.parser.ParseException: java.lang.RuntimeException: [was class java.io.IOException] Stream closed
	at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:260)
	at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:265)
	at org.apache.axiom.om.impl.traverse.OMChildrenQNameIterator.next(OMChildrenQNameIterator.java:93)
	at org.apache.abdera.parser.stax.util.FOMElementIteratorWrapper.next(FOMElementIteratorWrapper.java:41)
	at org.apache.abdera.parser.stax.util.FOMList.buffer(FOMList.java:74)
	at org.apache.abdera.parser.stax.util.FOMList.size(FOMList.java:88)
	at org.nuxeo.chemistry.client.app.test.TestAbderaConn.parseWithAbdera(TestAbderaConn.java:61)
	at org.nuxeo.chemistry.client.app.test.TestAbderaConn.main(TestAbderaConn.java:39)
Caused by: java.lang.RuntimeException: [was class java.io.IOException] Stream closed
	at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
	at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:706)
	at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3655)
	at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
	at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:245)
	at org.apache.axiom.om.impl.builder.StAXBuilder.createOMText(StAXBuilder.java:216)
	at org.apache.abdera.parser.stax.FOMBuilder.applyTextFilter(FOMBuilder.java:158)
	at org.apache.abdera.parser.stax.FOMBuilder.next(FOMBuilder.java:206)
	... 7 more

And in order to have a responsive application you need to run the
feed refresh asynchronously from the UI thread, so you cannot control
whether the stream is closed before or after the UI is completely
loaded.

So this cool AXIOM feature makes the AXIOM DOM unusable (or
hardly usable) for live objects displayed in rich client applications.
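
To illustrate the problem without Abdera or AXIOM, here is a minimal sketch that uses only the StAX API bundled with the JDK. All names in it (LazyFeedDemo, CloseAwareStream, readAllTitles, lateReadFails) are made up for this example and are not part of the Chemistry client. The generated feed is deliberately large so that the parser cannot have buffered all of it before the stream is closed.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class LazyFeedDemo {

    /** In-memory stream that, like a network stream, refuses reads after close(). */
    static final class CloseAwareStream extends FilterInputStream {
        private boolean closed;
        CloseAwareStream(byte[] data) { super(new ByteArrayInputStream(data)); }
        @Override public int read() throws IOException { check(); return super.read(); }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            check();
            return super.read(b, off, len);
        }
        @Override public void close() throws IOException { closed = true; super.close(); }
        private void check() throws IOException {
            if (closed) throw new IOException("Stream closed");
        }
    }

    /** Builds a synthetic atom feed with the given number of entries. */
    static byte[] feed(int entries) {
        StringBuilder sb = new StringBuilder("<feed xmlns=\"http://www.w3.org/2005/Atom\">");
        for (int i = 0; i < entries; i++) {
            sb.append("<entry><title>story ").append(i).append("</title></entry>");
        }
        return sb.append("</feed>").toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Advances to the next title element and returns its text, or null at end of feed. */
    static String nextTitle(XMLStreamReader r) throws XMLStreamException {
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT && "title".equals(r.getLocalName())) {
                return r.getElementText();
            }
        }
        return null;
    }

    /** Detached style: read everything while the stream is open, keep only plain strings. */
    static List<String> readAllTitles(InputStream in) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> titles = new ArrayList<>();
        for (String t = nextTitle(r); t != null; t = nextTitle(r)) {
            titles.add(t);
        }
        r.close();
        return titles;
    }

    /** Lazy style: read one entry, close the stream, then ask for the rest. True if that fails. */
    static boolean lateReadFails(InputStream in) throws Exception {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        nextTitle(r); // first entry is read while the stream is still open
        in.close();   // e.g. the refresh job releases the connection
        try {
            while (nextTitle(r) != null) { } // the "UI" asks for the remaining entries later
            return false;
        } catch (XMLStreamException | RuntimeException e) {
            return true; // the exact exception type depends on the StAX implementation
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = feed(5000);
        try (InputStream in = new CloseAwareStream(data)) {
            System.out.println("detached read: " + readAllTitles(in).size() + " titles");
        }
        System.out.println("late read failed: " + lateReadFails(new CloseAwareStream(data)));
    }
}
```

The detached read finishes before the stream is closed, so the resulting list can be handed to the UI safely. The late read is a rough stand-in for a lazy DOM whose nodes are still backed by the stream when the UI finally asks for them.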

Let's look at the second option for integrating Abdera.


2. Use Abdera to parse the feed and build Chemistry objects
that are totally detached from Abdera objects.

Pros:
- Provides atom validation and atom-oriented DOM objects.
- Easy way to parse feeds and CMIS objects using the high-level Abdera
DOM.
- Efficient parsing of the feed input stream thanks to the "load on
demand" feature of AXIOM (sections of the feed you are not interested
in will not be loaded into memory).

Cons:
- Extra dependencies required by the application, as mentioned above
(~6 extra JARs)

This is an acceptable approach. The only issue for me is the extra
dependencies. To parse an atom feed you need 6 JARs!?
It's true that for a Chemistry client, parsing the feed is not
the interesting part. The client code must concentrate on implementing
the Chemistry model efficiently, so that the client can be used
in both simple administration applications and highly responsive rich
client applications.

My point is that correctly parsing an atom feed with a focus on
CMIS objects can be done very efficiently using StAX, by
writing only a few classes (no more than 10).

So after thinking more about this I also considered using only AXIOM
(without Abdera). With only 2 extra JARs I am able to load
my feeds efficiently, but I lose the atom model.

Anyway, I finally realized that by writing only a few helper classes
over StAX I can do higher-level parsing in the style of AXIOM,
so I adopted this extreme approach.  :)
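
As a rough idea of what "a few helper classes over StAX" can look like, here is a small sketch using only the JDK's javax.xml.stream API. It is not the actual client code: StaxFeedReader, Entry, nextChild and skip are hypothetical names, the sample feed is invented, and only three atom elements are read (assumed to contain plain text). CMIS extension elements would be handled in the same loop instead of being skipped.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxFeedReader {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<feed xmlns=\"http://www.w3.org/2005/Atom\">\n"
        + "  <title>news</title>\n"
        + "  <entry><id>urn:x:1</id><title>First story</title>"
        + "<updated>2009-06-05T09:10:35Z</updated><link href=\"http://example.org/1\"/></entry>\n"
        + "  <entry><id>urn:x:2</id><title>Second story</title>"
        + "<updated>2009-06-05T09:12:00Z</updated></entry>\n"
        + "</feed>";

    /** Plain value object: holds no reference to the parser or the stream. */
    static final class Entry {
        final String id;
        final String title;
        final String updated;
        Entry(String id, String title, String updated) {
            this.id = id;
            this.title = title;
            this.updated = updated;
        }
    }

    /** Helper: moves to the next child start tag; returns false when the parent element ends. */
    static boolean nextChild(XMLStreamReader r) throws XMLStreamException {
        while (r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) return true;
            if (ev == XMLStreamConstants.END_ELEMENT) return false;
        }
        return false;
    }

    /** Helper: skips the current element and everything nested inside it. */
    static void skip(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth > 0 && r.hasNext()) {
            int ev = r.next();
            if (ev == XMLStreamConstants.START_ELEMENT) depth++;
            else if (ev == XMLStreamConstants.END_ELEMENT) depth--;
        }
    }

    static boolean isAtom(XMLStreamReader r, String name) {
        return ATOM_NS.equals(r.getNamespaceURI()) && name.equals(r.getLocalName());
    }

    /** Reads one atom entry; the reader is left on the entry's end tag. */
    static Entry readEntry(XMLStreamReader r) throws XMLStreamException {
        String id = null, title = null, updated = null;
        while (nextChild(r)) {
            if (isAtom(r, "id")) id = r.getElementText();
            else if (isAtom(r, "title")) title = r.getElementText();
            else if (isAtom(r, "updated")) updated = r.getElementText();
            else skip(r); // links, content, extension elements, ...
        }
        return new Entry(id, title, updated);
    }

    /** Reads the whole feed in one pass and returns detached entries. */
    static List<Entry> parse(InputStream in) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<Entry> entries = new ArrayList<>();
        try {
            r.nextTag(); // move to the root <feed> element
            while (nextChild(r)) {
                if (isAtom(r, "entry")) entries.add(readEntry(r));
                else skip(r); // feed-level metadata is ignored in this sketch
            }
        } finally {
            r.close();
        }
        return entries;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8));
        for (Entry e : parse(in)) {
            System.out.println(e.id + " | " + e.title + " | " + e.updated);
        }
    }
}
```

The two helpers carry most of the weight: once nextChild and skip exist, each element type needs only a short loop like readEntry, and the resulting objects no longer depend on the stream being open.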


Maybe my lists of pros and cons are not complete, but they may
help you in choosing a solution.
Personally, if we absolutely want Abdera I will vote for solution 2,
since 1 is not acceptable for my use cases.
If Abdera is not required then we can either use AXIOM or
use the StAX API directly, as in the current client.


Regards,
Bogdan


On 4 June 09, at 16:25, Florent Guillaume wrote:

> On 4 Jun 2009, at 15:34, Gabriele Columbro wrote:
>> G'day Chemicals,
>> as I finally found some time to spend on the mighty Chemistry, I  
>> was able to go through the ongoing mail threads and look a little
>> bit better at the status of the Chemistry codebase (with an eye on  
>> which parts of Alfresco that may be suitable for contribution).
>>
>> I would like to start working a bit on the client / TCK / build  
>> automation part of the project, but, before discussing the details  
>> with you guys and get into action, I saw a couple of open mail  
>> threads (forwarded one and [4]) on a topic that can impact a lot  
>> the way I can contribute to this project:
>> I'm talking about the implementation of the AtomPub Java Client.
>>
>> As I understand Florent is working on the AtomPub Java Client and  
>> IIUC it isn't going to be based on Abdera. Though I could not find  
>> yet any code in SVN (@Florent: nor in the Nuxeo HG 'default' [1]  
>> revision, am I pointing the right one or 'integrate-atom-pub' [2]  
>> is the one to look at?),
>
> Yes the code we have is in branch integrate-atompub-client in http://hg.nuxeo.org/sandbox/chemistry/
>  -- the old repo used before switching to Apache svn.
>
> But as it happens I'm studying this code right now to adapt it to  
> the newest Chemistry API refactorings, and I'll commit code in svn  
> before tonight, although it may be nonfunctional and not very unit  
> tested at all :( This code was written in a hurry by Bogdan for a  
> customer (although we have the IP on it) and is not up to the  
> standards I expected of it, so don't hesitate to criticize it and  
> discuss refactorings.
>
>> so I was finally wondering:
>> 1__ What's the state of the art of the AtomPub Java client impl?
>> What's the devs' opinion on the usage of Abdera? Has that already been
>> discussed and I missed it? :)
>
> No real discussion in these lists.
> After having worked with Abdera for the server part, I've come to  
> the conclusion that it's a big library, rewrapping a lot of Axiom.
> Also it's still very young, and not well designed for extensibility  
> if you stray from the simple "one feed with entries in it" model.
> Bogdan, for the client part, decided to not use Abdera because one  
> of his goals was to allow it to be a small embedded library, so StAX  
> was all that was really needed. Abdera apparently is creating lots  
> and lots of objects and use lots of memory, when a simple StAX-based  
> parser gave him huge performance boosts.
>
>> There are a couple of reasons why I ask you guys suggestions/ 
>> clarifications on this topic:
>> - Abdera is the standard Apache Atom implementation and we can rely
>> on a good cooperation between Apache projects
>
> Agreed, however note that working with SNAPSHOTs of other projects  
> is a headache in terms of release. So if we start modifying Abdera  
> then we'll have to think about how to release.
>
>> - In terms of maintenance overhead, I see good improvements if  
>> Abdera is used both in the server (IIUC) and client part
>
> Do you see any factoring between client and server beyond the Abdera  
> extensions, beyond the few ElementWrapper subclasses?
>
> Note also that I have already started using Abdera's  
> ExtensibleElementWrapper in chemistry-atompub, however I don't  
> register them as an Abdera extension (I instantiate directly)  
> because Abdera extensions are global and I don't want to step on the  
> toes of any other code that would like to work with Chemistry but  
> already uses its own Abdera extension (like Alfresco).
> chemistry-atompub only has the methods useful for the server though,
> not yet the client.
>
>> - In terms of dependencies explosion, I don't see a big deal in the  
>> Abdera (client) chain of (runtime) dependencies, especially if you  
>> consider that the (Java) client is going to be most likely to be  
>> used for Java based Content Repositories (or custom applications)  
>> integration and these are typically library-flooded applications  
>> anyways.
>
> I can't disagree with the fact that projects usually already use  
> lots of libraries, so what's one more. Note however that Abdera is  
> huge, abdera-core + abdera-i18n + abdera-parser are already at 900  
> Kb (Mostly due to Unicode data in abdera-i18n by the way).
>
>> - Choosing for Abdera, may enable me to contribute the already  
>> functional Abdera extension of Alfresco, so to give quite of a jump  
>> start on the TCK/Client side
>
> That's a good point.
> BTW we also have an Abdera extension in yet another (older) CMIS  
> sandbox (http://hg.nuxeo.org/sandbox/nuxeo-cmis/file/tip/src/main/java/org/apache/abdera/ext/cmis/)
> which could be used as well. If you contribute yours, I'll look at
> merging useful things we may have into it (although Abdera  
> extensions are in fact rather simple).
>
>> - The usage of Abdera seems to be an enabler for contributions  
>> already built on top of it (see Sourcesense CMIS portlet [3])
>>
>> 2__ Do you think the Abdera extension could be a valid  
>> contribution? And in such a case, would it belong to Chemistry or
>> Abdera itself?
>
> I would leave it in Chemistry until we consider it mature enough to  
> be moved to Abdera -- barring any dependency problems. This way  
> we'll get much more rapid turnaround in its update. It could move to  
> Abdera once CMIS 1.0 is released, for instance.
>
>> As I'm not sure what the status of Florent's implementation is, and
>> particularly as I don't want to waste any effort already done, and
>> this is actually my first interaction with the list,
>> so please forgive me if I'm missing some blatantly obvious point ;)
>
> No problem, these are all worthwhile points.
>
> My next steps are to study Bogdan's client code, and if I (or the
> list) feel it's inadequate then I'll scrap it to go back to a simple
> Abdera-based implementation. I'll commit something tonight in svn so  
> that others can look at it.
>
> Florent
>
> -- 
> Florent Guillaume, Head of R&D, Nuxeo
> Open Source, Java EE based, Enterprise Content Management (ECM)
> http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
>

