abdera-user mailing list archives

From Chris Berry <chriswbe...@gmail.com>
Subject atomserver.org
Date Fri, 30 May 2008 15:13:30 GMT
Greetings,

As some of you are aware, over the past year we have developed a  
generic Atompub data services layer on top of Abdera. Our employer has  
now graciously allowed us to release it as an open-source project  -  
available at http://www.atomserver.org. We were really hoping to get  
this out before the recent discussion on server refactoring, so we  
could join in and just point everyone at the actual code, but the  
timing did not work out, so please bear with this long-winded email.   
It appears that there is really not too large a difference between our  
code structure and the current server refactoring, although the names  
of our Objects are, of course, different, and at least arguably more  
clear for the end user.

Some of the primary tenets of our design include:

  * An Object Model which reflects the Atom "data model". Atom has the
    concepts of Service, Workspace, Collection, and Entry, where each of
    these has an implied interface. A Service contains a Set of one or
    more Workspaces and delegates operations to them. A Workspace has a
    Set of zero or more Collections and further delegates operations to
    them. Collections are a Set of zero or more Entries, which are
    represented as both generic metadata (e.g. updated, published, id)
    and some Entry-specific Content, which is, essentially,
    undefined. It is important to note that an object model which
    reflects this hierarchy is conceptually natural to the end user.
  * Content is the real "touch point" with the client. Almost every
    other aspect of the Atom Server can be handled for them, so that
    each client does not have to reinvent the wheel. The system contains
    numerous extension points for when the defaults do not meet the
    needs of the client.
  * The design must support "configuration points." We call these
    WorkspaceOptions and CollectionOptions. These are consumed to create
    their real-world counterparts. Configuring Workspaces and
    Collections directly is problematic, quite simply because we cannot
    anticipate all possible client options. Providing a level of
    indirection here yields a significantly more flexible design, and
    fits better with IOC systems. In other words, the end user
    configures WorkspaceOptions, which are consumed to produce
    Workspaces, etc.

As said, we have built a relatively full-featured Atom Server. It is
in Production, and has been battle-hardened with a ton of traffic (it  
is currently servicing ~1M hits/day). Our Atom Server acts as a data  
bus between our many disparate sites, allowing them to easily share  
information. I emphasize this to stress the point that the model  
presented below is the result of many iterations with several real  
world clients.
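
To make the "configuration point" idea concrete, here is a minimal sketch of options being consumed to produce workspaces. It is not AtomServer's actual code: SketchWorkspaceOptions, SketchWorkspace, and SketchService are hypothetical names, and the sketch assumes the options are plain beans carrying only a name and a localization flag.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical bean: the only thing the end user configures.
final class SketchWorkspaceOptions {
    private final String name;
    private final boolean localized;

    SketchWorkspaceOptions(String name, boolean localized) {
        this.name = name;
        this.localized = localized;
    }

    String getName() { return name; }
    boolean isLocalized() { return localized; }
}

// Hypothetical runtime object: built from options, never configured directly.
final class SketchWorkspace {
    private final SketchWorkspaceOptions options;

    SketchWorkspace(SketchWorkspaceOptions options) { this.options = options; }

    String getName() { return options.getName(); }
    SketchWorkspaceOptions getOptions() { return options; }
}

// Hypothetical service: consumes a Set of options and produces the workspaces.
final class SketchService {
    private final Map<String, SketchWorkspace> workspaces = new LinkedHashMap<>();

    void setWorkspaces(Set<SketchWorkspaceOptions> optionsSet) {
        for (SketchWorkspaceOptions options : optionsSet) {
            workspaces.put(options.getName(), new SketchWorkspace(options));
        }
    }

    // Returns null when no workspace of that name was configured.
    SketchWorkspace getWorkspace(String name) { return workspaces.get(name); }
}
```

Because the service, not the user, constructs each workspace, new option fields can be added later without changing how workspaces are wired.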

The approach we have taken is to clearly separate the entry content  
from the Atom metadata itself, and to provide an off-the-shelf  
solution. Our AtomServer takes on the responsibility of managing all
the Atom metadata: Workspaces, Collections, and Entries. The client is
responsible only for storing their content. And in general, we've also  
made even this transparent to them, since mostly they are storing XML  
- either as Files or as CLOBs in a DB. So, for most purposes, creating  
an AtomServer is simply an exercise in configuration (i.e. Spring).  
The Atom metadata is managed in a relational database, using iBatis
to yield database neutrality. We currently support HSQLDB, PostgreSQL,
and SQL Server, and support for other databases is planned. Our
database interaction (our schema and the queries we run) has gone
through several iterations; it guarantees transactional correctness
and can handle high loads on large datasets.

Our AtomServer implements the AtomPub spec (using Abdera, hence the  
post here), and has extensions that are largely based on GData,  
although we have made a few small usability tweaks based on client  
feedback. The current feature set includes, among others:

  * Support for optimistic concurrency, with overrides for single-writer
    schemes
  * Full support for consistent paging
  * Full Category support, including support for complex boolean
    Category queries
  * "Auto-tagging" support. Workspaces and/or Collections can be
    configured such that Categories are automatically created when
    Entries are created or updated. When the content is XML, this can
    be done using XPath
  * Pluggable Content validation support, with an emphasis on Relax NG
    (XSD support is planned)
  * Pluggable Content storage
  * Aggregate support. Workspaces and/or Collections can be configured
    such that aggregates of other Entries are directly addressable
  * Full support for Batch operations
  * Optional Locale sensitivity for Entries
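
As an illustration of the XPath-based auto-tagging mentioned in the list above, the following is a rough sketch using only the JDK's javax.xml.xpath API. The class name XPathTagSketch and the "(scheme)term" output format are assumptions made for this example, not AtomServer's actual configuration or API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: derives category terms from entry content with one XPath expression.
final class XPathTagSketch {
    private final String scheme;      // category scheme attached to every term
    private final String expression;  // XPath selecting the nodes whose text becomes a term

    XPathTagSketch(String scheme, String expression) {
        this.scheme = scheme;
        this.expression = expression;
    }

    // Returns one "(scheme)term" string per node matched in the content XML.
    List<String> tagsFor(String contentXml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(contentXml)),
                    XPathConstants.NODESET);
            List<String> tags = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                tags.add("(" + scheme + ")" + nodes.item(i).getTextContent().trim());
            }
            return tags;
        } catch (XPathExpressionException e) {
            // Covers both a bad expression and content that cannot be parsed.
            throw new IllegalArgumentException("cannot tag content", e);
        }
    }
}
```

A collection configured this way would run the tagger each time an Entry is created or updated and store the resulting Categories alongside the Entry metadata.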

Digging in to the design:

The top-level class AtomServer, as you would expect, extends Abdera's  
AbstractProvider and is configured (from Spring or some other IOC)  
with an AtomService. The AtomServer delegates its operations to an
AtomService, which, in turn, may delegate its operations to its
subordinate AtomCollections, and so on. Note that a couple of the
interfaces presented herein have been idealized to their intended
form, to which we are transitioning.

public class AtomServer extends org.abdera.AbstractProvider {
     public void setAtomService(AtomService atomService) {}

     public ResponseContext getService(RequestContext request) {}
     public ResponseContext getFeed(RequestContext request) {}
     public ResponseContext getEntry(RequestContext request) {}
     public ResponseContext createEntry(RequestContext request) {}
     public ResponseContext deleteEntry(RequestContext request) {}
     public ResponseContext updateEntry(RequestContext request) {}

     // other methods, such as getMedia(), will be implemented later
}
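
The delegation chain described above can be pictured with a small self-contained sketch. All four classes are hypothetical, simplified stand-ins; in particular, requests are reduced to three strings and responses to plain status strings, whereas the real classes work with Abdera's RequestContext and ResponseContext.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an AtomCollection: the only level that touches entries.
final class ChainCollection {
    private final Map<String, String> entries = new HashMap<>();

    void putEntry(String entryId, String content) { entries.put(entryId, content); }

    String getEntry(String entryId) {
        String content = entries.get(entryId);
        return content == null ? "404 no such entry" : "200 " + content;
    }
}

// Hypothetical stand-in for an AtomWorkspace: locates the collection, then hands down.
final class ChainWorkspace {
    private final Map<String, ChainCollection> collections = new HashMap<>();

    void addCollection(String name, ChainCollection collection) {
        collections.put(name, collection);
    }

    String getEntry(String collection, String entryId) {
        ChainCollection target = collections.get(collection);
        return target == null ? "404 no such collection" : target.getEntry(entryId);
    }
}

// Hypothetical stand-in for an AtomService: locates the workspace, then hands down.
final class ChainService {
    private final Map<String, ChainWorkspace> workspaces = new HashMap<>();

    void addWorkspace(String name, ChainWorkspace workspace) {
        workspaces.put(name, workspace);
    }

    String getEntry(String workspace, String collection, String entryId) {
        ChainWorkspace target = workspaces.get(workspace);
        return target == null ? "404 no such workspace" : target.getEntry(collection, entryId);
    }
}

// Hypothetical stand-in for the AtomServer: does nothing beyond delegating.
final class ChainServer {
    private ChainService service;

    void setService(ChainService service) { this.service = service; }

    String getEntry(String workspace, String collection, String entryId) {
        return service.getEntry(workspace, collection, entryId);
    }
}
```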

The AtomService interface:

public interface AtomService {
     void setWorkspaces(java.util.Set<WorkspaceOptions> workspaceOptionsSet);
     AtomWorkspace getAtomWorkspace(String workspace);

     java.util.Collection<String> listWorkspaceNames(RequestContext request);
     java.util.Collection<Workspace> listWorkspaces(RequestContext request);

     URIHandler getURIHandler();
     void setUriHandler(URIHandler uriHandler);
     void verifyURIMatchesStorage(String workspace, String collection,
                                  IRI iri, boolean checkIfCollectionExists);
}

The URIHandler interface is as follows. As mentioned in several recent
email threads, these Resolvers are somewhat challenging. We created
our own URIHandler because the URL structure,
{workspace}/{collection}/{entry}/{revision}, is relatively standard,
and because we were essentially having to define that structure in two
locations, the Spring config and the code that consumed URLs, which
had to agree. The implication of this was that, ultimately, we were
parsing the URL twice.

public interface URIHandler extends Resolver<Target> {
     public void setRootPath(String rootPath);
     public void setContextPath(String contextPath);
     public String constructURIString(String workspace, String collection,
                                      String entryId, Locale locale, int revision);
     public String getServiceBaseUri();

     public EntryTarget getEntryTarget(Request request);
     public FeedTarget getFeedTarget(Request request);
     public ServiceTarget getServiceTarget(Request request);

     public Target resolve(Request request);
     public URITarget parseIRI(RequestContext requestContext, IRI iri);
}

Omitting details, we have the following bean-type classes:

public abstract class URITarget extends AbstractTarget {}
public class ServiceTarget extends URITarget implements ServiceDescriptor {}
public class FeedTarget extends URITarget implements FeedDescriptor {}
public class EntryTarget extends URITarget implements EntryDescriptor {}

And the Descriptors are defined as:

public interface ServiceDescriptor {
     String getWorkspace();
}

public interface FeedDescriptor {
     String getWorkspace();
     String getCollection();
}

public interface EntryDescriptor {
     String getWorkspace();
     String getCollection();
     String getEntryId();
     Locale getLocale();
     int getRevision();
}
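
To show how a single parse of the {workspace}/{collection}/{entry}/{revision} structure can yield all three descriptor levels, here is a minimal sketch. ParsedPath and PathParserSketch are hypothetical names, the root path handling is an assumption, and locale is left out because the email does not say how it is encoded in the URL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result type: fields that are absent from the path stay null.
final class ParsedPath {
    final String workspace;
    final String collection;
    final String entryId;
    final Integer revision;

    ParsedPath(String workspace, String collection, String entryId, Integer revision) {
        this.workspace = workspace;
        this.collection = collection;
        this.entryId = entryId;
        this.revision = revision;
    }

    boolean isService() { return collection == null; }
    boolean isFeed() { return collection != null && entryId == null; }
    boolean isEntry() { return entryId != null; }
}

// Hypothetical parser: splits the path once, under an assumed "/{rootPath}/" prefix.
final class PathParserSketch {
    private final String rootPath;

    PathParserSketch(String rootPath) { this.rootPath = rootPath; }

    ParsedPath parse(String path) {
        String prefix = "/" + rootPath + "/";
        if (!path.startsWith(prefix)) {
            throw new IllegalArgumentException("not under " + prefix + ": " + path);
        }
        List<String> parts = new ArrayList<>();
        for (String part : path.substring(prefix.length()).split("/")) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        if (parts.isEmpty() || parts.size() > 4) {
            throw new IllegalArgumentException("expected 1 to 4 path segments: " + path);
        }
        String workspace = parts.get(0);
        String collection = parts.size() > 1 ? parts.get(1) : null;
        String entryId = parts.size() > 2 ? parts.get(2) : null;
        // A non-numeric revision raises NumberFormatException, an IllegalArgumentException.
        Integer revision = parts.size() > 3 ? Integer.valueOf(parts.get(3)) : null;
        return new ParsedPath(workspace, collection, entryId, revision);
    }
}
```

With a root path of "foo", the path /foo/widgets/acme/4/2 resolves to an entry-level result, while /foo/widgets/acme resolves to a feed-level one.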

And then, back to the chain of command, the AtomWorkspace:

public interface AtomWorkspace {
    AtomService getParentAtomService();
    String getName();

    AtomCollection getAtomCollection(String collectionName);
    void setCollections(java.util.Set<CollectionOptions> collectionOptionsSet);

    boolean collectionExists(String collectionName);
    java.util.Collection<Collection> listCollections(RequestContext request);
    java.util.Collection<String> listCollectionNames(RequestContext request);

    WorkspaceOptions getOptions();
    void setOptions( WorkspaceOptions options );

    void bootstrap();
}

And the AtomCollection:

public interface AtomCollection {
     AtomWorkspace getParentAtomWorkspace();
     String getName();

     Entry getEntry(RequestContext request);
     Feed getEntries(RequestContext request);
     UpdateCreateOrDeleteEntry.CreateOrUpdateEntry updateEntry(RequestContext request);
     java.util.Collection<UpdateCreateOrDeleteEntry> updateEntries(RequestContext request);
     Entry deleteEntry(RequestContext request);

     ContentStorage getContentStorage();
     ContentValidator getContentValidator();
     CategoriesHandler getCategoriesHandler();
     EntryAutoTagger getAutoTagger();

     CollectionOptions getOptions();
     void setOptions(CollectionOptions options);

     java.util.Collection<Category> listCategories(RequestContext request);
     void ensureCollectionExists(String collectionName);
}
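
Since updates carry a revision (see getRevision() on EntryDescriptor), the optimistic concurrency feature listed earlier could be implemented along the following lines. This is only a sketch under stated assumptions: the rule that the caller passes the revision it last read, the ANY_REVISION override for single-writer setups, and the class names are all hypothetical, and the email does not describe AtomServer's actual revision semantics.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception raised when the caller's revision is stale.
final class RevisionConflictException extends RuntimeException {
    RevisionConflictException(String message) { super(message); }
}

// Hypothetical store: accepts a write only if the caller saw the latest revision.
final class RevisionedStoreSketch {
    // Assumed override for single-writer schemes: skip the revision check.
    static final int ANY_REVISION = -1;

    private final Map<String, Integer> revisions = new HashMap<>();
    private final Map<String, String> contents = new HashMap<>();

    // Returns the new revision number; an entry that was never written is at revision 0.
    synchronized int update(String entryId, int expectedRevision, String content) {
        int current = revisions.getOrDefault(entryId, 0);
        if (expectedRevision != ANY_REVISION && expectedRevision != current) {
            throw new RevisionConflictException(
                    "entry " + entryId + " is at revision " + current
                            + ", caller expected " + expectedRevision);
        }
        int next = current + 1;
        revisions.put(entryId, next);
        contents.put(entryId, content);
        return next;
    }

    synchronized String getContent(String entryId) { return contents.get(entryId); }
}
```

A writer that loses the race gets a conflict instead of silently overwriting the other writer's change, and can re-read the entry and retry.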

And in AtomCollection we delegate to several interfaces:

public interface ContentStorage {
     String getContent( EntryDescriptor descriptor );
     void putContent( String contentXml, EntryDescriptor descriptor );
     void deleteContent( String deletedContentXml, EntryDescriptor descriptor );
     void obliterateContent( EntryDescriptor descriptor );

     void initializeWorkspace(String workspace);
     void testAvailability();
     boolean canRead();
     boolean contentExists(EntryDescriptor descriptor);
}

public interface ContentValidator {
     void validate(String content) throws BadContentException;
}

public interface CategoriesHandler {
     List<Category> listCategories( FeedDescriptor descriptor );
}

public interface EntryAutoTagger {
     void tag(EntryDescriptor entry, String content);
}
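
To give a feel for how small a pluggable implementation can be, here is a sketch of an in-memory content store and a validator that only checks XML well-formedness. The interfaces are trimmed restatements of the ones above so that the example compiles on its own; SimpleDescriptor, WellFormedXmlValidator, and InMemoryContentStorage are hypothetical names, and making BadContentException unchecked is an assumption.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Trimmed restatement of EntryDescriptor (locale and revision omitted for brevity).
interface EntryDescriptor {
    String getWorkspace();
    String getCollection();
    String getEntryId();
}

// Hypothetical plain implementation of the trimmed descriptor.
final class SimpleDescriptor implements EntryDescriptor {
    private final String workspace;
    private final String collection;
    private final String entryId;

    SimpleDescriptor(String workspace, String collection, String entryId) {
        this.workspace = workspace;
        this.collection = collection;
        this.entryId = entryId;
    }

    public String getWorkspace() { return workspace; }
    public String getCollection() { return collection; }
    public String getEntryId() { return entryId; }
}

// Assumed to be unchecked here; the email does not say which kind it is.
class BadContentException extends RuntimeException {
    BadContentException(String message, Throwable cause) { super(message, cause); }
}

interface ContentValidator {
    void validate(String content) throws BadContentException;
}

// Hypothetical validator: accepts any well-formed XML document, rejects everything else.
final class WellFormedXmlValidator implements ContentValidator {
    public void validate(String content) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(content)));
        } catch (Exception e) {
            throw new BadContentException("content is not well-formed XML", e);
        }
    }
}

// Hypothetical storage covering a subset of ContentStorage, keyed by the descriptor.
final class InMemoryContentStorage {
    private final Map<String, String> contents = new ConcurrentHashMap<>();

    private static String key(EntryDescriptor d) {
        return d.getWorkspace() + "/" + d.getCollection() + "/" + d.getEntryId();
    }

    String getContent(EntryDescriptor descriptor) { return contents.get(key(descriptor)); }

    void putContent(String contentXml, EntryDescriptor descriptor) {
        contents.put(key(descriptor), contentXml);
    }

    boolean contentExists(EntryDescriptor descriptor) {
        return contents.containsKey(key(descriptor));
    }

    void obliterateContent(EntryDescriptor descriptor) { contents.remove(key(descriptor)); }
}
```

In this arrangement a collection would call validate() first and putContent() only if validation passes, so invalid content never reaches storage.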

And finally, WorkspaceOptions and CollectionOptions could be tag
interfaces at the top level, although in our current implementation
they are concrete classes.

Our Spring configuration file looks something like this:

    <bean id="uriHandler" class="org.atomserver.uri.URIHandler">
         <property name="rootPath" value="foo"/>
         <property name="contextPath" value="bar"/>
     </bean>

    <a:serviceContext>
         <a:provider>
             <ref bean="provider"/>
         </a:provider>
         <a:targetResolver>
             <ref bean="uriHandler"/>
         </a:targetResolver>
     </a:serviceContext>

    <bean name="provider" class="org.atomserver.AtomServer">
         <property name="atomService" ref="service"/>
     </bean>

    <bean name="simpleValidator"  
class="org.atomserver.core.validators.SimpleXMLContentValidator"/>

    <bean name="service"
          class="org.atomserver.core.dbstore.DBBasedAtomService"
          init-method="initialize">
         <property name="uriHandler" ref="uriHandler"/>
         <property name="workspaces">
             <set>
                 <bean class="org.atomserver.core.WorkspaceOptions">
                     <property name="name" value="widgets"/>
                     <property name="IsLocalized" value="true"/>
                     <property name="defaultContentStorage"  
ref="fileBasedContentStorage"/>
                     <property name="defaultContentValidator"  
ref="simpleValidator"/>
                     <property name="defaultCategoriesHandler"  
ref="entryCategoriesHandler"/>
                 </bean>
                 ......

It should be noted that we are in the process of greatly simplifying  
our Spring configuration, such that defaults are preset and then  
overridden, and extended Spring elements are available. When this work  
is complete, you will be able to configure an AtomServer as simply as  
this, where all other wiring can be as transparent as you desire:

<util:set id="org.atomserver-workspaces">
       <as:workspace name="widgets" />
       .....

atomserver.org is available on CodeHaus, and everyone on this list is  
encouraged to go download the code and check it out.  We'd welcome any  
feedback you have - and since some of what has been going on in Abdera  
proper overlaps with functionality that AtomServer has, we'd love to  
explore the best way to bridge that gap.  AtomServer started life as a  
proprietary service to solve a particular business need - but as it  
evolved we quickly saw that it had more general applications, and we  
are thankful that our employer was insightful enough to allow us to  
share it.  We hope that the "battle-hardening" that the server has
gone through by being subjected to large, real-world traffic can help
anyone who needs a solution like this be up and running with minimum
effort.

We look forward to your feedback!

Cheers,
-- Chris & Bryon

