I'm writing JavaScript to process website content, and I've run into a frustrating issue with the SharePoint text editor. It tends to insert a "zero width space" character (Unicode value 8203, or B200 in hexadecimal) when the user hits backspace. I've tried to remove it with the standard "replace" function, without success. Variations such as:
var a = "om"; //the invisible character is between o and m
var b = a.replace(/\u8203/g,'');
b = a.replace(/\uB200/g,'');
b = a.replace("\\uB200",'');
have all failed to eliminate the unwanted character. Strangely, typing the actual character in the expression seems to be the only effective method:
var b = a.replace("",''); //it's there, believe me
This approach has its own problems: because the character is invisible, the line of code is perplexing to read. Moreover, if the file's encoding changes, or the file is deployed to SharePoint where the encoding may differ, this solution may stop working. Is there a way to do this with Unicode escape notation instead of relying on the literal character?
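One likely explanation, sketched here on the assumption that the character really is U+200B ZERO WIDTH SPACE: a `\u` escape takes four hexadecimal digits, and decimal 8203 is 200B in hex, not B200. That would mean `/\u8203/` and `/\uB200/` each match some other character, and `"\\uB200"` is just the six literal characters backslash, u, B, 2, 0, 0. The variable names below are only for illustration.

```javascript
// Build the test string from an escape so the invisible character shows up in source.
var ZWSP = "\u200B";            // U+200B ZERO WIDTH SPACE (decimal 8203)
var a = "o" + ZWSP + "m";       // three UTF-16 code units, two visible

// Converting the decimal value to hex gives the digits for the escape.
var hex = (8203).toString(16);  // "200b"

// \uB200 is a different character, so nothing is removed.
var stillThere = a.replace(/\uB200/g, "").length; // 3

// Regex with the correct escape; the g flag removes every occurrence.
var b = a.replace(/\u200B/g, "");

// Alternative that uses the decimal value directly, with no escape at all.
var c = a.split(String.fromCharCode(8203)).join("");
```

Either form keeps the source file plain ASCII, so it should not depend on the encoding the file is saved or served in.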
[Musings about the zero-width space]
If you're unfamiliar with this elusive character (which most likely is the case, given its invisibility unless it causes issues in your code), it can wreak havoc on certain pattern matching functions. Here's what it looks like when caged:
[] <- careful, don't let it escape.
To view it, copy these brackets into a text editor and step across them with the arrow keys. You'll notice that it takes three key presses to cover what appears to be two characters, with one press in the middle where the cursor doesn't seem to move.
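The same check can be done in code rather than with the arrow keys. This is a minimal sketch; `showCodePoints` is a hypothetical helper name, and it lists UTF-16 code units, which is enough for a character in this range.

```javascript
// Hypothetical helper: list each UTF-16 code unit of a string as U+XXXX.
function showCodePoints(s) {
  var out = [];
  for (var i = 0; i < s.length; i++) {
    var hex = s.charCodeAt(i).toString(16).toUpperCase();
    while (hex.length < 4) hex = "0" + hex; // pad to four digits
    out.push("U+" + hex);
  }
  return out.join(" ");
}

// The "caged" character, written with an escape so it is visible here.
var caged = "[" + "\u200B" + "]";
var codes = showCodePoints(caged); // "U+005B U+200B U+005D"
```

Running this on text copied out of the editor shows exactly which invisible characters are present and what escape to use for them.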