I ran into a tricky problem while processing a large XML file on the client side: some Unicode characters are replaced with unreadable sequences, so the server cannot parse the XML. This is how I serialize and send the document:
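// Serialize the DOM node to a string and log it for comparison
var text = new XMLSerializer().serializeToString(xmlNode);
console.log(text);
// POST the serialized XML asynchronously
var req = new XMLHttpRequest();
req.open('POST', config.saveUrl, true);
// My attempt to force UTF-8
req.overrideMimeType("application/xml; charset=UTF-8");
req.send(text);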
The console log shows the correct string:
<Language Self="Language/$ID/Czech" Name="$ID/Czech" SingleQuotes="‚‘" DoubleQuotes="„“" PrimaryLanguageName="$ID/Czech" SublanguageName="$ID/" Id="266" HyphenationVendor="Hunspell" SpellingVendor="Hunspell" />
But when I inspect the request (Chrome dev tools) and the data received on the server side, the string has been altered:
<Language Self="Language/$ID/Czech" Name="$ID/Czech" SingleQuotes="‚‘" DoubleQuotes="„“" PrimaryLanguageName="$ID/Czech" SublanguageName="$ID/" Id="266" HyphenationVendor="Hunspell" SpellingVendor="Hunspell" />
The XML file's original encoding is also UTF-8. The same problem occurs when I send the request with jQuery.
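
To narrow this down, here is a minimal sketch of a check that could help tell apart "the bytes are wrong" from "the bytes are right but decoded with the wrong charset". It relies on two things I believe hold: `send()` with a string encodes the body as UTF-8, and `overrideMimeType` only affects how the response is interpreted, so the request's `Content-Type` would have to be set with `setRequestHeader` instead. The helper names (`encodeUtf8`, `decodeUtf8`, `decodeLatin1`, `postXml`) are hypothetical and not part of my original code, and whether this matches what the server actually does is an assumption.

```javascript
// Hypothetical helper: the UTF-8 bytes a string would be sent as.
function encodeUtf8(text) {
  return new TextEncoder().encode(text);
}

// Hypothetical helper: decode bytes correctly, as UTF-8.
function decodeUtf8(bytes) {
  return new TextDecoder('utf-8').decode(bytes);
}

// Hypothetical helper: simulate a receiver that wrongly treats each byte
// as one Latin-1 character (a common cause of garbled quotes).
function decodeLatin1(bytes) {
  return Array.from(bytes, function (b) {
    return String.fromCharCode(b);
  }).join('');
}

// Hypothetical variant of the request that declares the charset on the
// request itself. Defined here only; it needs a browser to run.
function postXml(url, text) {
  var req = new XMLHttpRequest();
  req.open('POST', url, true);
  req.setRequestHeader('Content-Type', 'application/xml; charset=UTF-8');
  req.send(text);
}

// The four quote characters from the Czech <Language> element.
var sample = '\u201A\u2018\u201E\u201C';
var bytes = encodeUtf8(sample);

console.log(decodeUtf8(bytes) === sample);   // true: bytes survive a UTF-8 round trip
console.log(decodeLatin1(bytes) === sample); // false: same bytes, wrong charset
```

If the garbled text on the server looks like the `decodeLatin1` output, the bytes on the wire are probably fine and the receiving side is decoding them with the wrong charset.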