Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Subject: RE: XSLT API Proposal
To: Kay Michael <Michael.Kay@icl.com>
Cc: cocoon-dev@xml.apache.org, xalan-dev@xml.apache.org,
        James Clark <jjc@jclark.com>, Steve Muench <smuench@us.oracle.com>,
        Adam Winer <awiner@us.oracle.com>, Assaf Arkin <arkin@exoffice.com>
From: "Scott Boag/CAM/Lotus" <Scott_Boag@lotus.com>
Date: Sat, 5 Feb 2000 01:11:26 -0500
Message-ID: <OFE4111502.FAC4EA67-ON8525687C.001BFAFF@lotus.com>
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii


Kay Michael <Michael.Kay@icl.com> wrote:
> There's the wretched issue that a SAX DocumentHandler doesn't handle
> comments. I don't think we can dodge this one, if comments are lost then
> the processor isn't conformant.

Yeah, comments and some other stuff.  SAX2 does have interfaces for
these, but they are broken down into separate interfaces (though I haven't
had a chance to look at the final release).  My take was that the processor
can try to cast the DocumentHandler object to these interfaces, which
either works or doesn't.

The DocumentHandler object is the standard way through the XSLT interface.
The processor might also check whether it is an object it knows about, so
that it can set a flag for disabling output escaping, or the like.

> or we can
> try and do it properly using SAX2 interfaces.

Yep, the only thing is, I don't really want to force the caller to pass in
all the interfaces as separate objects, hence my settling on just passing
in the DocumentHandler and assuming that the processor could try to cast
it to LexicalHandler and the like.
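
A minimal sketch of that cast-and-see approach, written against the SAX
classes bundled with current JDKs.  CastSketch, CommentAwareHandler, and
emitComment are hypothetical names for illustration, not part of the
proposed API; only DocumentHandler, HandlerBase, and LexicalHandler are
real SAX types.

```java
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;
import org.xml.sax.ext.LexicalHandler;

final class CastSketch {
    /** Hypothetical caller-side handler: one object implementing both interfaces. */
    static final class CommentAwareHandler extends HandlerBase implements LexicalHandler {
        final StringBuilder comments = new StringBuilder();

        @Override public void comment(char[] ch, int start, int length) {
            comments.append(ch, start, length);
        }
        // The remaining LexicalHandler callbacks are not needed for this sketch.
        @Override public void startDTD(String name, String publicId, String systemId) { }
        @Override public void endDTD() { }
        @Override public void startEntity(String name) { }
        @Override public void endEntity(String name) { }
        @Override public void startCDATA() { }
        @Override public void endCDATA() { }
    }

    /**
     * Processor side: the caller passed only a DocumentHandler.  Forward the
     * comment if the same object also implements LexicalHandler; otherwise
     * report that it had to be dropped.
     */
    static boolean emitComment(DocumentHandler handler, String text) throws SAXException {
        if (handler instanceof LexicalHandler) {
            char[] ch = text.toCharArray();
            ((LexicalHandler) handler).comment(ch, 0, ch.length);
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(emitComment(new CommentAwareHandler(), "note")); // prints true
        System.out.println(emitComment(new HandlerBase(), "note"));         // prints false
    }
}
```

The caller still hands over a single object; whether comments survive
depends only on which interfaces that object happens to implement.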

> Another problem with DocumentHandler is that it isn't namespace-aware

Again, SAX2 is.

> then it may need to know information
> from xsl:output in order to do its work.

Yes, absolutely.  I realized I hadn't addressed xsl:output just after I
sent the note.  But, I'm not quite sure what to do about it.  We need a
class like Assaf Arkin's OutputFormat class (see the Xerces Serialize
classes).  I don't want to make a class that's redundant with his, but the
interfaces obviously shouldn't be dependent on Xerces either.  Assaf, do
you have any ideas on this?

> There's generally a question
> about how much the API should be able to do to set output properties that
> could be set from xsl:output: if encoding, why not method and indent?

Again, if we had something like Assaf's OutputFormat class, it would solve
this.

> There are some interesting little questions about the semantics of
> strip-space when a DOM document is supplied: is it changed in situ?

I wouldn't say so.  I think the input DOM should always be immutable.  I
should say this in the API, if people agree.

> I think we should say
> that the Node supplied must be a document or an element.

Hmm... for the input node?  Passing in DocumentFragments can be useful, as
well as text nodes, or even attributes, though I admit these are fringe
cases.  I would probably be happy enough to limit it, if I can understand
the justification.  For the result target, this is clearly a fair
constraint.

> I'm not sure about writing the result to a Node. I can see a requirement
> to present the result as a DOM Document, but attaching it to a Node of an
> existing Document seems a bit obscure.

I have a fair number of users of Xalan doing this... people use XSLT in
some pretty obscure ways.

> What about DOM2? Again, DOM is not namespace-aware, and doesn't give you
> information which you need for conformance, such as IDs.

As with SAX, I think it's up to the processor to use DOM2 behind the
scenes.  I don't think that DOM2-specific interfaces should be passed via
the API, at least for this round.

> What about DOM2? Again, DOM is not namespace-aware, and doesn't give you
> information which you need for conformance, such as IDs. (Or am I wrong?
> I'm not a DOM expert)

Not sure, I would have to check.  I thought it did.  Again, the processor
can always try to cast to a familiar DOM implementation if this is not so,
otherwise, what's to be done?

> we might want to abstract
> away from this assumption by doing
> XSLTResultTarget.setNextTransform(transform).

I don't have any deep opinions on this.  I guess it seems like a feature we
could survive without.  Do people have opinions?

> Why is setParameter() defined on XSLTProcessor, shouldn't it be on
> Transform?

No, Transform needs to be thread-safe, with a single instance able to run
concurrently in multiple threads.  The XSLTProcessor is equivalent to the
session object, so that is where parameters should be set, in my opinion.
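
A rough sketch of that split, assuming a compiled stylesheet that is
immutable and a per-session object that owns the parameters.  The classes
TransformSketch, ProcessorSketch, and SessionDemo are hypothetical
stand-ins, and the "$name" substitution is only a placeholder for real
stylesheet processing.

```java
import java.util.HashMap;
import java.util.Map;

final class SessionDemo {
    public static void main(String[] args) throws InterruptedException {
        // One shared, immutable transform; one processor (session) per thread.
        TransformSketch shared = new TransformSketch("Hello, $name!");
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            final String who = "user" + i;
            workers[i] = new Thread(() -> {
                ProcessorSketch session = new ProcessorSketch(shared);
                session.setParameter("name", who);
                System.out.println(session.process());
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }
}

/** Hypothetical stand-in for a compiled stylesheet: immutable, so safe to share. */
final class TransformSketch {
    private final String template;

    TransformSketch(String template) {
        this.template = template;
    }

    /** Reads only final state plus the caller's parameters; nothing is mutated. */
    String apply(Map<String, Object> params) {
        String result = template;
        for (Map.Entry<String, Object> e : params.entrySet()) {
            result = result.replace("$" + e.getKey(), String.valueOf(e.getValue()));
        }
        return result;
    }
}

/** Hypothetical session object: holds the mutable parameters for one run. */
final class ProcessorSketch {
    private final TransformSketch transform;
    private final Map<String, Object> params = new HashMap<>();

    ProcessorSketch(TransformSketch transform) {
        this.transform = transform;
    }

    void setParameter(String name, Object value) {
        params.put(name, value);
    }

    String process() {
        return transform.apply(params);
    }
}
```

Had setParameter lived on the shared transform instead, the two threads
would be writing to the same parameter map and could see each other's
values.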

> The
> arguments need some thought too: for the name, is there a need to supply
> a namespace URI;

Good point.  Probably.  This should be a separate parameter, and should be
the full URI.  I assume I should do another signature, or should I just add
an argument that can be null?

> and what types of Object can be supplied?

Xalan has you pass an XObject in, that can be an XNumber, XNodeSet,
XBoolean, etc., in addition to any Java type.  I left that out, because it
complicates things, and I don't think it is strictly necessary.  In any
case, you should be able to pass any Java object in as a parameter, and it
should be up to the processor to figure out the conversion rules.  We could
set up some general conversion rules, if you like.

Xalan also has a way to pass in an expression to be evaluated.  Should I
have a setParamExpression method?  I left it out for the sake of
minimalism, but would be glad enough to add it.

> It would be a useful convenience to allow setFile() on XSLTInputSource.
> Converting a File name to a URL is dead easy in Java 1.2 but a lot of
> people want to stay compatible with 1.1, where it is hard work.

Yeah, I have a lot of ugly code to either understand a URI or filename
interchangeably.  It seems to me you have to do this anyway for xsl:include
and the like, since a lot of people want to use full paths for filenames in
the stylesheet  (I'm not sure I fully understand the rules for system IDs
in XML).  It seems to me we shouldn't change the SAX conventions, but I'm
easy if people think we should have a setFile method.
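
For illustration, a small sketch of accepting either a URI or a file name
as a system ID.  SystemIdSketch and toSystemId are hypothetical names, and
the sketch leans on File.toURI(), which arrived in Java 1.4 and so would
not have helped anyone staying on 1.1.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

final class SystemIdSketch {
    /**
     * Returns the argument unchanged (in external form) when it already
     * parses as a URL, and otherwise treats it as a file name resolved
     * against the current working directory.
     */
    static String toSystemId(String uriOrFileName) throws MalformedURLException {
        try {
            return new URL(uriOrFileName).toExternalForm();
        } catch (MalformedURLException notAUrl) {
            // No recognizable protocol, so assume a plain file name.
            return new File(uriOrFileName).toURI().toURL().toExternalForm();
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(toSystemId("http://example.org/style.xsl"));
        System.out.println(toSystemId("style.xsl"));
    }
}
```

The fallback is a heuristic: anything that begins with a known protocol
name followed by a colon is taken as a URL, so it is only as good as that
guess.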

> Should we allow the user to specify some kind of URIResolver for turning
> the URIs used in document() and xsl:import etc into an XSLTInputSource?

That sounds good to me.

> In general I think names should not be prefixed XSLT. I agree that
> creates potential confusion in the case of InputSource and Exception, so
> perhaps
> these are exceptions!

Hmm... It seems to me that XSLTProcessor is more readable in people's code
than Processor.  XSLTException needs it, I think.  XSLTInputSource needs
it, as you say.  XSLTResultTarget could probably be renamed to
ResultTarget.  The proposed XSLTDocument needs it also.  I think whenever
the design pattern is generic, like Processor, Exception, Document, and
InputSource, the prefix is good and reasonable.  In the cases where the
design pattern is specific to the core of what XSLT does, like Transform
and maybe ResultTarget, you don't need the prefix.

BTW, I wouldn't mind renaming XSLTProcessor to XSLTExecContext.  What do
you think?

> We should think about extensibility. A SAX2-like mechanism to get and set
> general properties might save us embarrassment in the future.

Yes, I agree.  I'll take a look at the SAX2 mechanism and steal it.  :-)
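
As a rough picture of what a SAX2-like mechanism could look like, here is
a sketch modelled on the getProperty/setProperty pair of SAX2's XMLReader,
where properties are named by URI strings and unknown names are rejected.
PropertySketch and the INDENT property URI are hypothetical; only
SAXNotRecognizedException is a real SAX class.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.SAXNotRecognizedException;

final class PropertySketch {
    /** Hypothetical property name; SAX2 identifies properties by URI strings. */
    static final String INDENT = "http://example.org/xslt/properties/indent";

    private final Map<String, Object> properties = new HashMap<>();

    PropertySketch() {
        // Only properties registered here are recognized.
        properties.put(INDENT, Boolean.FALSE);
    }

    void setProperty(String name, Object value) throws SAXNotRecognizedException {
        requireKnown(name);
        properties.put(name, value);
    }

    Object getProperty(String name) throws SAXNotRecognizedException {
        requireKnown(name);
        return properties.get(name);
    }

    private void requireKnown(String name) throws SAXNotRecognizedException {
        if (!properties.containsKey(name)) {
            throw new SAXNotRecognizedException(name);
        }
    }
}
```

New properties can then be added later without touching any method
signature; callers that ask for a name the processor does not know get an
exception rather than silent acceptance.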

Mike, thanks so much for your great comments.  As always, it is a pleasure
trading notes with you.

-scott