synapse-dev mailing list archives

From "Kim Horn" <kim.h...@icsglobal.net>
Subject RE: VFS Text Files with spaces don't work.
Date Fri, 03 Apr 2009 22:04:10 GMT
In theory you are correct, but we have been a Web Service B2B shop for about 10 years and nothing
can be assumed to work according to standards. We know that unless such elements are wrapped
in CDATA, the white space will get removed somewhere. Opinions on the net:

"Now when the XML specification says any white space, they don’t really mean it. HA!
The standards leave some aspects of white space handling up to the implementers, or at least
that’s what the implementers would have us believe. I suspect some implementers choose
to ignore parts of the standards they don’t like or can’t accommodate easily in
their toolsets. It’s inevitable that different XML parsers make different interpretations
of the standards. This leads to some fuzzy behavior where white space is concerned."

another openxml rfp:
"White space handling is an unresolved issue in the present definition of XML parsers, falling
outside the scope of both the DOM specification and the SAX API."

In our case, when we send data on to a customer, who knows what parser or technology gets used?
How many other routers, mediators and proxies mess up the XML, using technology from the 1980s?
So, in practice, we wrap all text data that has leading/trailing spaces with CDATA "" "". At the
moment, for these small files using this data format standard, I am using a JS mediator to
take the text payload and insert it into the destination XML. This means using CDATA to wrap the
JS script, but then I can't nest another CDATA inside the JS script XML to wrap the
data. So I will have to move this to XSLT or, most likely, Java, unless you know a way around
this.

Unfortunately, due to the size/volume of the files, we cannot log them directly in Synapse. Due to privacy
laws in the US I cannot see the production data directly, so it takes a while to debug these issues.
 

I trust you when you say it cannot be happening inside Synapse/Axis, and that really
helps rule out where this is happening. But we are losing the spaces, and the first place
to rule out when tracing the files through the process was Synapse at VFS. Given these problems,
wrapping the data in CDATA at the start of the process, although a paranoid approach, would
mean it is then safe all the way through.

It will take a few days to debug this here; I will tell you what we find.


Thanks
Kim






-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
Sent: Fri 03/04/2009 19:00
To: dev@synapse.apache.org
Subject: Re: VFS Text Files with spaces don't work.
 
This is not very convincing for several reasons:

* An XML parser never removes whitespace.
* A validating XML parser reports whitespace _between elements_ in a
special way, but it is up to the application to decide what to do with
it. Note that we are not talking about this type of whitespace here.
* XSLT is not schema aware.
* While xml:space="preserve" is defined in the XML specs, it has no well
defined semantics and it is up to the application to interpret it.
* Axis2 and Synapse don't (or at least shouldn't) remove any whitespace.
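
The first point can be checked with a few lines of Python. This is a minimal sketch using the standard library parser, not Axis2 or Synapse, and the fixed-width record is a made-up example:

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width record padded with leading and trailing spaces.
record = "   ACME      0042   "

# Same content, once as plain character data and once inside CDATA.
plain = ET.fromstring("<text>" + record + "</text>")
cdata = ET.fromstring("<text><![CDATA[" + record + "]]></text>")

plain_text = plain.text
cdata_text = cdata.text

print(repr(plain_text))
print(plain_text == cdata_text == record)
```

Both forms hand the application the same string, padding included. One thing a conforming parser does change is line endings, which are normalized to a single line feed with or without CDATA, so this sketch says nothing about what later processing steps do with the text.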

What you really need to do is to determine at what step in the
mediation the whitespace is lost. Then we can try to understand why
this is so.
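
One way to do this without logging the payload itself is to record only a fingerprint at each step and compare them. A rough sketch in Python follows; the step names, the fingerprint and first_change helpers, and the sample payloads are all hypothetical, and in a real setup the values would come from whatever hook each mediation step offers:

```python
import hashlib


def fingerprint(payload: str) -> dict:
    """Summarize a payload without exposing its content."""
    return {
        "length": len(payload),
        "leading_spaces": len(payload) - len(payload.lstrip(" ")),
        "trailing_spaces": len(payload) - len(payload.rstrip(" ")),
        "sha256": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }


def first_change(steps):
    """Return the name of the first step whose payload differs from the previous one."""
    previous = None
    for name, payload in steps:
        current = fingerprint(payload)
        if previous is not None and current != previous:
            return name
        previous = current
    return None


# Hypothetical trace: the padding survives two steps and is gone at the third.
steps = [
    ("vfs-read", "  A1  "),
    ("script-mediator", "  A1  "),
    ("send", "A1"),
]
culprit = first_change(steps)
print(culprit)
```

The length and space counts show directly whether padding was trimmed, and the hash catches any other change, so only these small records need to be logged rather than the files.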

Andreas

On Fri, Apr 3, 2009 at 07:19, kimhorn <kim.horn@icsglobal.net> wrote:
>
> You are probably right. Haven't had time to look but I hope the payload
> "text" element is defined as space="preserve". If yes then its OK, If not
> then ?
>
> Any idea where the XSD is easily available ?
>
> It is probably one of the mediators removing the white space along the way;
> as XML does not preserve this. I will have to add in yet again more java to
> wrap the text field in CDATA "". As the recipient cannot change their XSD,
> this looks like the only option....Not sure XSLT will work unless target
> name space also defines the element as space preserve ?
>
> My simple Synapse script is becoming a massive Java program. And I thought
> "wouldn't it be easy
> to use a scripting tool like Synapse compared to writing Java code ". How
> wrong.
>
> Thanks
> Kim
>
>
>
>
> Andreas Veithen-2 wrote:
>>
>> Are you sure that these spaces get trimmed inside the VFS transport
>> and not somewhere in your mediation? Normally the plain text message
>> builder is designed to strictly preserve the file content (including
>> spaces), so this would be a serious bug.
>>
>> Andreas
>>
>> On Thu, Apr 2, 2009 at 08:52, kimhorn <kim.horn@icsglobal.net> wrote:
>>>
>>> Run into a problem with VFS reading text files with fixed field length
>>> fields, where empty fields are padded with spaces. There are a number of
>>> B2B
>>> formats that do this.
>>>
>>> If the empty fields are at the start or end of the file then when these
>>> are
>>> inserted into XML as Payload the
>>> XML removes the spaces. The text should be wrapped in CDATA with double
>>> Quotes to preserve this space data; but VFS does not do this. So the
>>> fields
>>> at start or end of file get lost and hence the whole file is now garbage.
>>>
>>> Hopefully reading them as binary files (not plain text) will get over this
>>> ?
>>> Other ideas ?
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS-Text-Files-with-spaces-don%27t-work.-tp22841970p22841970.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS-Text-Files-with-spaces-don%27t-work.-tp22841970p22862146.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
>
>



