Or, if you feel you have to store the file in memory before parsing, then
at least store it in a byte array rather than a char array. You can then
feed the parser an InputStream over that byte array (a
ByteArrayInputStream) and avoid the encoding and byte-order problems that
Marshall describes.
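
For what it's worth, here is a minimal sketch of the idea, using the JDK's
built-in DOM parser rather than UIMA. The class name CachedXmlParser and
the sample document are made up for illustration; only the JDK calls are
real.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not part of UIMA): parses XML that was cached as raw bytes.
final class CachedXmlParser {

    // Wrap the cached bytes in an InputStream so the parser sees the original
    // bytes and can work out the encoding from the BOM / XML declaration itself.
    static Document parse(byte[] cachedXml) throws Exception {
        try (InputStream in = new ByteArrayInputStream(cachedXml)) {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample document (an assumption for this sketch): it declares
        // ISO-8859-1, and the cached bytes really are ISO-8859-1.
        byte[] cached =
                "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><doc>caf\u00e9</doc>"
                        .getBytes(StandardCharsets.ISO_8859_1);

        Document doc = parse(cached);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Since John says XmiCasDeserializer.deserialize takes an InputStream, the
same ByteArrayInputStream could presumably be passed to it directly; I
have not tried that here.
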
On Sun, 22 Aug 2010 13:48:39 -0700, Marshall Schor <msa@schor.com> wrote:
> I'm not an expert here, but I found by googling that at least one person
> thinks it's a bad practice to read things into char arrays, and then send
> those to an XML parser.
>
> The web page http://www.odi.ch/prog/design/newbies.php#7 says:
>
> It is a very bad idea to read an XML file and store it in a String. An XML
> specifies its encoding in the XML header. But when reading a file you have
> to know the encoding beforehand! Also storing an XML file in a String
> wastes memory. All XML parsers accept an InputStream as a parsing source
> and they figure out the encoding themselves correctly. So you can feed
> them an InputStream instead of storing the whole file in memory
> temporarily. The byte order (big-endian, little-endian) is another trap
> when a multi-byte encoding (such as UTF-8) is used. XML files may carry a
> byte order mark at the beginning that specifies the byte order. XML
> parsers handle them correctly.
>
> -Marshall
>
> On 8/22/2010 8:52 AM, John Wiesel wrote:
>> Dear all,
>>
>> I am currently stalled in my project by XmiCasDeserializer.deserialize: I
>> am wondering why there is no method that allows one to set up the XML
>> parser directly with an InputSource instead of an InputStream. I would
>> like to load my CAS from an XMI file that I have cached in a CharArray.
>> As I cannot generate an InputStream from a String
>> (StringBufferInputStream has been deprecated since JDK 1.1) but should be
>> able to do so using an InputSource w/o much trouble, I hope there is a
>> sensible solution for this that I just haven't thought of yet.
>>
>> Any suggestions?
>> Thanks folks.
>>
>> John
>>
>>