tomcat-users mailing list archives

From André Warnier
Subject Re: Tomcat strips CRLFs from the generated page
Date Tue, 14 Jan 2014 10:00:09 GMT

1) On this list, do not "top post". Read the rules:
"tomcat-users" .. Important .. 6.

Asok Chattopadhyay wrote:
> As I said before, I have no control over the input text. In the test
> servlet I am simply reading text from a file and sending it out to the
> browser. No other processing has been done to the text by the servlet. The
> browser, however, receives a page with CRLF stripped starting from a
> certain point in the text.
> If I View source in the browser, I can see that happening.
> This is consistent over most operating systems (Windows and Linux) and most
> browsers (IE, Firefox and Chrome) and the stripping happens exactly at the
> same point onward, in all combinations of OS and browser.
> My question is: who is stripping the CRLF from the text? Is it Tomcat or
> the browser? Is Tomcat doing any validation of the text before sending it
> out to the client?

2) It would be good if you listened to what you have been told so far:
The original HTML document which you have indicated to us is a mess:
- it contains lines terminated by a single LF, and other lines terminated by CR/LF.
This is true for the whole document, not just starting at line 153.
You have to fix this before any further analysis can possibly be done.
If you have no control over that content, then you need to ask whoever has control
over it to fix it.
- it contains multiple HTML syntax errors.  Same thing.
Taking into account the versions and patches, there are probably several hundred
different versions of browsers on the market right now.  While many of them will do a
reasonable and consistent job handling correct HTML, each of these versions will
probably handle incorrect HTML in a slightly different way (even for the "view page
source" functionality).  The fact is, what you see in the browser's "view page source"
may or may not be the original content.  There is no way to tell, really.
- by default, Tomcat will NOT change anything in the content of a page or file that it
serves.  If the page is served by the default servlet (as is supposedly the case when
using the link), then it will be sent to the browser as it is on disk.
- your webapp *might* change the content, due to how it is reading and/or writing that
content.  I don't know, and maybe someone else will have the time to analyse your
servlet in detail.  But anyway, the original content that your servlet is reading is
garbage, so if it does not change anything, it will write the same garbage to the
browser.  If you want to send out correct HTML, you either have to fix the original
file first, or else do the fixing in your servlet.
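
To make the points above concrete, here is a minimal sketch of how one could check a file for mixed line endings, and how a servlet could pass the bytes through untouched. The class and method names (LineEndingCheck, count, copyVerbatim) are hypothetical, made up for illustration; this is not your servlet, and not something Tomcat provides.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of Tomcat
// and not the servlet discussed in this thread.
final class LineEndingCheck {

    // Returns {CRLF count, bare LF count, bare CR count} for the raw bytes.
    static int[] count(byte[] data) {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\r') {
                if (i + 1 < data.length && data[i + 1] == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                } else {
                    cr++;
                }
            } else if (data[i] == '\n') {
                lf++;
            }
        }
        return new int[] { crlf, lf, cr };
    }

    // Copies bytes unchanged, so whatever line endings the input has
    // (consistent or not) arrive at the output exactly as they were read.
    static void copyVerbatim(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```

If count() reports more than one non-zero value for the file on disk, the line endings are mixed in the source itself. In a servlet, something like copyVerbatim(fileInputStream, response.getOutputStream()) would send those bytes as they are. By contrast, a loop built on BufferedReader.readLine() drops the line terminators, and whatever the code writes back afterwards (or fails to write back) determines what the browser gets. Whether your servlet does that, I cannot tell without seeing it.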
