xml-general mailing list archives

From Andy Clark <an...@apache.org>
Subject Re: DocumentBuilder.parse()
Date Fri, 01 Dec 2000 05:00:27 GMT
Ahmad Morad wrote:
> how can I avoid the parser from closing the stream after the 
> parse process has been finished ?

This is a bug. Could someone please look into fixing this?

You can avoid it by ignoring the call to close the stream. For
example (these classes are in java.io):

  public class IgnoreCloseInputStream extends FilterInputStream {
    public IgnoreCloseInputStream(InputStream stream) {
      super(stream);
    }
    public void close() throws IOException {
      // do nothing, so the wrapped stream stays open
    }
  }
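
A minimal usage sketch of the wrapper with DocumentBuilder.parse() follows. The names IgnoreCloseDemo, CloseTrackingStream and parseRootName are hypothetical and exist only for this illustration; CloseTrackingStream stands in for a socket stream so that it is visible whether close() ever reached the underlying stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class IgnoreCloseDemo {

    /** The wrapper from the message: close() is swallowed. */
    static class IgnoreCloseInputStream extends FilterInputStream {
        IgnoreCloseInputStream(InputStream stream) { super(stream); }
        @Override public void close() throws IOException { /* do nothing */ }
    }

    /** Hypothetical stand-in for a socket stream; records calls to close(). */
    static class CloseTrackingStream extends ByteArrayInputStream {
        boolean closed = false;
        CloseTrackingStream(byte[] data) { super(data); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Parses one document through the wrapper and returns its root element name. */
    static String parseRootName(InputStream in) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new IgnoreCloseInputStream(in));
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<greeting>hi</greeting>".getBytes(StandardCharsets.UTF_8);
        CloseTrackingStream raw = new CloseTrackingStream(xml);
        String root = parseRootName(raw);
        // Any close() issued by the parser stops at the wrapper,
        // so raw.closed should still be false here.
        System.out.println(root + " closed=" + raw.closed);
    }
}
```

The caller keeps ownership of the underlying stream and remains responsible for closing it once it is really finished with it.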

However, you're still going to run into a problem in the 
current version of the parser because it wants to read an
entire block of bytes from the file (16K, I believe)
before parsing anything. In a situation with a socket, 
this is troublesome.

The new parser (Xerces2) will parse as much as it can no
matter how many bytes were returned by the underlying
input stream. But this still doesn't solve the problem
with transmitting multiple pieces of information (could
be multiple XML documents or perhaps just any information
after the end of the XML document on the stream) -- the
problem is that the parser will try to buffer past the
end of the XML file. Not buffering is way too expensive,
so it's not really a good solution to the problem.

The solution is to wrap the output and input streams to
specify some kind of protocol so that the input stream
being read by the parser signals the end of the XML file
as an end of the stream (even though the stream really
hasn't ended!). Does this make sense?
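
One possible way to sketch such a protocol is a length prefix: the sender writes a 4-byte byte count before each document, and the receiver hands the parser a stream that reports end of stream after exactly that many bytes. This is only an illustration under assumptions, not necessarily the sample described in this message; the names FramedXml, writeFrame, readFrame and BoundedInputStream are hypothetical. Because it counts bytes rather than looking for a marker character, it does not depend on the document's character encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

public class FramedXml {

    /** Sender side: write a 4-byte length, then the document bytes. */
    static void writeFrame(OutputStream out, byte[] xml) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(xml.length);
        data.write(xml);
        data.flush();
    }

    /** Reports end of stream after a fixed byte count; never closes the real stream. */
    static class BoundedInputStream extends FilterInputStream {
        private int remaining;

        BoundedInputStream(InputStream in, int length) {
            super(in);
            remaining = length;
        }
        @Override public int read() throws IOException {
            if (remaining <= 0) return -1;
            int b = super.read();
            if (b >= 0) remaining--;
            return b;
        }
        @Override public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            if (remaining <= 0) return -1;
            int n = super.read(buf, off, Math.min(len, remaining));
            if (n > 0) remaining -= n;
            return n;
        }
        @Override public long skip(long n) throws IOException {
            long skipped = super.skip(Math.min(n, remaining));
            remaining -= (int) skipped;
            return skipped;
        }
        @Override public int available() throws IOException {
            return Math.min(super.available(), remaining);
        }
        @Override public boolean markSupported() { return false; }
        @Override public void close() { /* leave the underlying stream open */ }
    }

    /** Receiver side: read one frame, parse it, return the root element name. */
    static String readFrame(InputStream in) throws Exception {
        int length = new DataInputStream(in).readInt();
        BoundedInputStream bounded = new BoundedInputStream(in, length);
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String root = builder.parse(bounded).getDocumentElement().getTagName();
        // Drain any bytes of this frame the parser did not consume,
        // so the next frame starts at the right position.
        while (bounded.read() != -1) { /* discard */ }
        return root;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeFrame(wire, "<first/>".getBytes(StandardCharsets.UTF_8));
        writeFrame(wire, "<second/>".getBytes(StandardCharsets.UTF_16));

        // One continuous stream carrying two documents in different encodings.
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readFrame(in));
        System.out.println(readFrame(in));
    }
}
```

The trade-off of a length prefix is that the sender must know the size of the whole document before sending it; a chunked framing (a length before each block, with a zero-length block as terminator) would avoid that at the cost of a little more code.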

I have a good idea for a sample to show one way of doing
this that is independent of the XML document's character
encoding, but I just haven't had the time. A lot of
solutions offered by people assume a certain character
encoding so that they can embed a special character or
token that is read by their special input stream in order
to signal the end of the XML file.

When I get the time, I'll write a sample for this because
it is a very common problem.

Andy Clark * IBM, TRL - Japan * andyc@apache.org