Opposite-world "XML Parsing Error: not well-formed" error



I know what "XML Parsing Error: not well-formed" means broadly: somehow the text does not comply with the XML specification. This would normally mean that there are unmatched tags or perhaps an incorrectly written header.
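As a minimal sketch of the unmatched-tag kind of failure, the following Java snippet uses the JDK's built-in `DocumentBuilder` to test whether a string is well-formed. The class name `WellFormedCheck`, the method `isWellFormed`, and the ASCII sample strings are made up for illustration; they are not part of the REST service.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are assumptions made for this example.
class WellFormedCheck {

    // Returns true if the XML text parses without a fatal error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Parsing from a Reader sidesteps byte-encoding questions entirely.
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // The JDK parser reports unmatched tags as a SAXParseException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<foo>RUCK</foo>")); // matched tags
        System.out.println(isWellFormed("<foo>RUCK</bar>")); // unmatched tags
    }
}
```

Because the input here is already a Java string, this check only covers structural problems, not the encoding problems described next.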


However, a document can also be not well-formed because of its character encoding, and there I'm getting results that seem the opposite of what I would expect.


When I make a REST call to a Java REST service from a browser on my Windows 7 machine to the Tomcat instance on the same machine, I get back an XML document that contains the following word as text:



<foo>RÃœCK</foo>


I know that's what I get because I used curl to save the results, and that's exactly what's in the document. However, when viewed in Firefox, IE8, or Chrome, the "Ãœ" part of the text actually displays as a single U with two dots above it (Ü), and none of the browsers complains about the document being not well-formed.


Then I make a call to the same REST service, except this time from my Windows 7 machine to a Linux machine running Tomcat. What I get is:



<foo>RÜCK</foo>


That's what I see when I use curl to download the results. However, both Firefox and IE complain that the XML document is not well-formed!
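One possibility, which nothing above confirms, is that the Linux response carries "Ü" as the single ISO-8859-1 byte 0xDC while the browser's XML parser reads the document as UTF-8, where a lone 0xDC followed by an ASCII letter is an invalid byte sequence. The sketch below only shows how such a byte stream could be tested; the class `Utf8Check` and its method are hypothetical, and the byte arrays are built locally rather than taken from either server.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are assumptions made for this example.
class Utf8Check {

    // Returns true only if the bytes form a strictly valid UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<foo>R\u00DCCK</foo>"; // \u00DC is U with diaeresis

        // Assumed stand-ins for the two responses, not captured traffic.
        byte[] asUtf8 = xml.getBytes(StandardCharsets.UTF_8);        // ... C3 9C ...
        byte[] asLatin1 = xml.getBytes(StandardCharsets.ISO_8859_1); // ... DC ...

        System.out.println(isValidUtf8(asUtf8));   // true
        System.out.println(isValidUtf8(asLatin1)); // false
    }
}
```

Running the same check on the files saved with curl would show whether the bytes from each server are valid UTF-8, independent of how an editor or terminal chooses to display them.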


I know that somehow when I copy and paste "Ü" it changes from being a single character to being two characters, due to document encoding or something similar. But here is the next confusing thing.
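The one-character-to-two-characters effect can be reproduced in a few lines of Java. This is a sketch under the assumption that the two characters come from UTF-8 bytes being read as windows-1252; the class `MojibakeDemo` and its methods are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions made for this example.
class MojibakeDemo {

    // Encodes text as UTF-8, then (wrongly) decodes those bytes as windows-1252.
    static String utf8ReadAsCp1252(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "R\u00DCCK"; // 4 characters, \u00DC is U with diaeresis
        String garbled = utf8ReadAsCp1252(original);

        // In UTF-8 the single character \u00DC becomes the two bytes C3 9C.
        System.out.println(toHex(original.getBytes(StandardCharsets.UTF_8))); // 52 C3 9C 43 4B
        System.out.println(original.length() + " -> " + garbled.length());    // 4 -> 5
        // Read as windows-1252, C3 is \u00C3 and 9C is \u0153.
        System.out.println(garbled.equals("R\u00C3\u0153CK"));                // true
    }
}
```

The source uses Unicode escapes so that the result does not depend on the encoding the compiler assumes for the file.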


When I update the database to store "RÃœCK" (the copy-pasted value), it displays as "RÃœCK" when sent from Tomcat on Windows, but when sent from Tomcat on Linux it gives a not well-formed error! Why?


Can anyone explain what exactly causes the Windows and Linux systems to display the same data differently, and why the document is not well-formed when it comes from the Linux Tomcat server but is well-formed when it comes from the Windows 7 Tomcat server?

