abdera-dev mailing list archives

From: James M Snell <jasn...@gmail.com>
Subject: Re: Graceful handling of non-atom 1.0 feeds
Date: Sun, 11 Jun 2006 21:05:43 GMT

There's actually a very practical use for this arbitrary XML parsing
mechanism that's already in the code.  When you call
content.setValue(...) on an atom:content with XML content, you can pass
in an XML string.  The parser will parse it to create the appropriate
ExtensionElement object to set as the child of the content object. This
also works when setting XHTML content on the Content and Text objects,
making it very simple for us to construct XML and XHTML nodes.

The API is something like...

  entry.setContentAsXml("<a><b><c/></b></a>", baseUri);
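
To illustrate the idea outside of Abdera, here is a minimal sketch that
uses only the JDK's JAXP parser to turn an XML string into an element
that could then be attached as a child. XmlContentSketch and
parseFragment are made-up names for this example, not part of the
Abdera API, and the real implementation may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration only; not an Abdera class.
final class XmlContentSketch {

    /** Parses a well-formed XML string and returns its root element. */
    static Element parseFragment(String xml, String baseUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            InputSource source = new InputSource(new StringReader(xml));
            // The system id gives the parser a base for relative references.
            source.setSystemId(baseUri);
            return builder.parse(source).getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Anything that is not well-formed XML ends up here.
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    public static void main(String[] args) {
        Element root = parseFragment("<a><b><c/></b></a>", "http://example.org/");
        System.out.println(root.getTagName());
        System.out.println(root.getFirstChild().getNodeName());
    }
}
```

The only requirement on the input is well-formedness, which matches the
liberal behaviour described in this thread.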

Regarding spec compliance, I had been kicking around the idea of a
Validator.INSTANCE.validate(...) mechanism.  You could pass in any of
the FOM objects and have it validate against the spec.  This would also
allow us to configure validators of various strengths and purposes (e.g.
StrictValidator, LiberalValidator, UltraliberalValidator,
MyGdataValidator, Atom03Validator, etc).  It also provides a clean
separation between the parser and the validator.
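
A rough sketch of what that separation could look like is below. None of
this exists in the code today: SketchEntry stands in for a FOM object,
and SketchValidator, LiberalValidator and StrictValidator are
hypothetical names and shapes used only to show the idea. The strict
variant checks for atom:id, atom:title and atom:updated, which Atom 1.0
requires on every entry; the date check uses java.time as an
approximation of the Atom date format.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

// Small demo driver; all types in this file are hypothetical.
final class ValidatorSketch {
    public static void main(String[] args) {
        SketchEntry entry = new SketchEntry(null, "Hello", "yesterday");
        System.out.println(new LiberalValidator().validate(entry));
        System.out.println(new StrictValidator().validate(entry));
    }
}

/** Stand-in for a parsed entry; a null field means the element was absent. */
final class SketchEntry {
    final String id;
    final String title;
    final String updated;

    SketchEntry(String id, String title, String updated) {
        this.id = id;
        this.title = title;
        this.updated = updated;
    }
}

/** Validators never touch the parser; they only inspect the parsed object. */
interface SketchValidator {
    /** Returns the list of problems found; empty means the entry passed. */
    List<String> validate(SketchEntry entry);
}

/** Only complains about a date that is present but unparseable. */
final class LiberalValidator implements SketchValidator {
    @Override
    public List<String> validate(SketchEntry entry) {
        List<String> problems = new ArrayList<>();
        if (entry.updated != null && !isOffsetDateTime(entry.updated)) {
            problems.add("atom:updated is not a valid date: " + entry.updated);
        }
        return problems;
    }

    private static boolean isOffsetDateTime(String value) {
        try {
            OffsetDateTime.parse(value); // ISO-8601 date-time with offset
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}

/** Adds checks for the elements every Atom 1.0 entry must carry. */
final class StrictValidator implements SketchValidator {
    private final SketchValidator liberal = new LiberalValidator();

    @Override
    public List<String> validate(SketchEntry entry) {
        List<String> problems = new ArrayList<>(liberal.validate(entry));
        if (entry.id == null) problems.add("missing atom:id");
        if (entry.title == null) problems.add("missing atom:title");
        if (entry.updated == null) problems.add("missing atom:updated");
        return problems;
    }
}
```

Returning a list of problems rather than throwing lets the caller decide
whether a given finding is an error or just a warning.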

- James

Elias Torres wrote:
> On 6/10/06, James M Snell <jasnell@gmail.com> wrote:
>> Abdera will successfully parse any well-formed XML.  The trick is not to
>> use generics when parsing.
> [snip]
>> The parser is currently very liberal.  It will make sure that Atom Date
>> Constructs are at least in ISO 8601 format and will validate URIs, but
>> everything else is left wide open.  The absolute minimum it requires is
>> well-formed XML.  A broad spectrum of Atom spec violations are allowed.
>> We don't attempt to correct any of those errors, however.  For example,
>> if someone puts escaped HTML markup in a text construct that is marked
>> as text, Abdera will represent that data as plain text.
>> - James
> I'm sort of in the middle on this. If our main goal is to create a
> fully-compliant Atom parser/protocol implementation, why should we be
> parsing any feed or XML document out there? Well, the answer is
> because the world is not perfect as Paul mentioned already. But then
> if we are a liberal parser then I'm afraid we'd become something like
> Universal Feed Parser.
> If anything I propose we add "modes" for parsing in which we can throw
> exceptions and warnings if we see something suspiciously
> non-compliant. We could have strict, liberal, middle-of-the-road modes
> :). This way we serve the community with a well-defined role: a
> validating Atom parser/protocol implementation. Maybe something that
> will allow anyone to host their "atom validator" for their site/app
> inside or outside of their company.
> Just thinking out loud. However, it seems that a decision on the matter
> should be central to our project.
> -Elias
