xml-general mailing list archives

From Andy Clark <an...@apache.org>
Subject [Sample] Parsing XML Documents on Socket
Date Mon, 04 Dec 2000 04:37:51 GMT
We've had another posting asking how to read XML documents
from a socket connection. Instead of just answering the
question, I've gone one better and written a new sample that
shows you how to do it!

The solution to reading an XML document on a socket is to wrap 
the input and output streams in a simple protocol. This lets 
the parser parse a document on the socket stream, detect the 
end of the document and (even more important) avoid closing 
the socket connection when it closes the input stream.

My sample code works regardless of the encoding of the XML
document and is general enough that it can be used to handle
any variable length data being transferred on a socket. This
is basically how it works: the server "wraps" the output 
stream when sending the XML document and the client "unwraps" 
the input stream when receiving the XML document. I'll detail 
exactly how this works below but what's important is that it 
works transparently to the server and client as long as they 
use the appropriate "wrapper" classes for the input/output 
streams.

The wrapper input/output streams introduce a simple "packet"
protocol onto the stream. When the server writes data to the
output stream, the wrapper class breaks that data into a
series of packets. Each packet consists of a simple header
that just states how many bytes are in the packet (not
including the header), followed by the packet data. The
receiving input stream knows how to read the header and
returns only the bytes in the packet data to the calling
client code.
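
As a rough illustration of the packet idea (this is NOT the
code from the sample), here is a minimal sketch. The class
names PacketOutputStream, PacketInputStream and PacketDemo are
hypothetical. The 4-byte big-endian length header and the
zero-length packet used as an end marker are assumptions made
for the sketch; the description above only says that the
header carries the byte count.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sender: frames each write as [4-byte length][data]. */
class PacketOutputStream extends OutputStream {
    private final DataOutputStream out;
    private boolean closed;

    PacketOutputStream(OutputStream socketOut) { out = new DataOutputStream(socketOut); }

    @Override public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override public void write(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return;        // zero length is reserved for the end marker
        out.writeInt(len);           // header: payload size, header not counted
        out.write(buf, off, len);    // packet data
    }

    @Override public void flush() throws IOException { out.flush(); }

    /** Ends the logical stream; the underlying socket stream stays open. */
    @Override public void close() throws IOException {
        if (closed) return;
        closed = true;
        out.writeInt(0);             // assumed end-of-data marker
        out.flush();
    }
}

/** Hypothetical receiver: strips the headers and reports EOF at the end marker. */
class PacketInputStream extends InputStream {
    private final DataInputStream in;
    private int remaining;           // unread data bytes in the current packet
    private boolean ended;           // true once the end marker has been read

    PacketInputStream(InputStream socketIn) { in = new DataInputStream(socketIn); }

    @Override public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (ended) return -1;
        if (remaining == 0) {
            remaining = in.readInt();            // next packet header
            if (remaining == 0) { ended = true; return -1; }
        }
        int n = in.read(buf, off, Math.min(len, remaining));
        if (n == -1) throw new EOFException("stream ended inside a packet");
        remaining -= n;
        return n;
    }

    /** Discards whatever is left up to the end marker; never closes the socket. */
    @Override public void close() throws IOException {
        byte[] scratch = new byte[512];
        while (read(scratch, 0, scratch.length) != -1) { /* discard */ }
    }
}

class PacketDemo {
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8];
        for (int n; (n = in.read(chunk)) != -1; ) buf.write(chunk, 0, n);
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        for (String doc : new String[] { "<a/>", "<b>two</b>" }) {
            OutputStream out = new PacketOutputStream(wire);
            out.write(doc.getBytes(StandardCharsets.UTF_8));
            out.close();                         // writes the end marker only
        }
        InputStream socketIn = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(readAll(new PacketInputStream(socketIn))); // <a/>
        System.out.println(readAll(new PacketInputStream(socketIn))); // <b>two</b>
    }
}
```

Because the framing works on raw bytes, it does not care what
encoding the document uses. Both documents travel over the
same underlying stream, and each reader sees an ordinary end
of stream at the end of its own document.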

These input/output classes provide a general mechanism for
sending variable length data on a socket connection. They
act as if there were a localized input/output stream within
the socket stream, and they can be used independently of
the socket sample. There are a few caveats, though: 1) the
server code MUST close the wrapper output stream; 2) the
client code MUST close the wrapper input stream. The second
requirement only matters if you detect a parse error and
need to skip to the end of the wrapper input stream to
continue processing the next piece of information.
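
To make the second caveat concrete, here is a small sketch of
a client that hits a parse error in one document and still
reads the next one from the same stream. It is not taken from
the sample: FramedInput and SkipDemo are hypothetical names,
and the wire format (4-byte length header, zero-length packet
as end marker) is an assumption. The parser is the standard
JAXP DocumentBuilder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

/** Hypothetical reader for [4-byte length][data] packets ended by a zero length. */
class FramedInput extends InputStream {
    private final DataInputStream in;
    private int left;          // unread bytes in the current packet
    private boolean done;      // true once the end marker has been read

    FramedInput(InputStream socketIn) { in = new DataInputStream(socketIn); }

    @Override public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (done) return -1;
        if (left == 0) {
            left = in.readInt();
            if (left == 0) { done = true; return -1; }
        }
        int n = in.read(buf, off, Math.min(len, left));
        if (n == -1) throw new EOFException("connection lost inside a packet");
        left -= n;
        return n;
    }

    /** Skips to the end marker; never closes the underlying socket stream. */
    @Override public void close() throws IOException {
        byte[] scratch = new byte[512];
        while (read(scratch, 0, scratch.length) != -1) { /* discard */ }
    }
}

class SkipDemo {
    /** Server side: frames one document in 4-byte packets, then the end marker. */
    static void send(DataOutputStream out, String xml) throws IOException {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        for (int off = 0; off < bytes.length; off += 4) {
            int len = Math.min(4, bytes.length - off);
            out.writeInt(len);
            out.write(bytes, off, len);
        }
        out.writeInt(0);   // caveat 1: the server "closes" its wrapper stream
    }

    /** Returns the root element name, or null if the document does not parse. */
    static String parseRoot(InputStream socketIn) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // caveat 2: try-with-resources closes the wrapper, which skips any
        // unread rest of this document so the next one starts cleanly
        try (FramedInput doc = new FramedInput(socketIn)) {
            return builder.parse(doc).getDocumentElement().getNodeName();
        } catch (SAXException e) {
            return null;   // parse error in this document only
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        DataOutputStream out = new DataOutputStream(wire);
        send(out, "<broken><x></broken> followed by an unread tail");
        send(out, "<ok/>");
        InputStream socketIn = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(parseRoot(socketIn)); // null (first document is malformed)
        System.out.println(parseRoot(socketIn)); // ok
    }
}
```

Without the close in parseRoot, any part of the broken
document that the parser had not yet consumed would still be
sitting on the stream, and the next read would start in the
middle of it.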

I added the sample to the Xerces2 codebase with the
assumption that we'll be moving that code over soon and it
can find a permanent home. If you would like to check it
out now, here's how you do it (this will only check out the
socket samples dir from CVS):

  set CVSROOT=:pserver:anoncvs@xml.apache.org:/home/cvspublic
  cvs login        (password: anoncvs)
  cvs checkout -d socket -r xerces_j_2 xml-xerces/java/samples/socket

This will create a directory called "socket" which contains
the sample "socket.KeepSocketOpen". You can read the javadoc
for information about how to use this sample. Or, you can
just use the wrapper streams independently. They're in the
"socket.io" package and are called "WrappedOutputStream"
and "WrappedInputStream", respectively.

We'll have to put some explanation in the actual Xerces
documentation so that people know about this sample and
how to solve the XML-on-a-socket problem, though. Any
volunteers?

Let me know if this sample helps.

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org
