As previously stated, you are encountering two distinct escape sequences in this situation:
\n represents a newline character: Unicode Character 'LINE FEED (LF)' (U+000A)
\\ represents a backslash: Unicode Character 'REVERSE SOLIDUS' (U+005C)
While these escape sequences consist of two characters in source code, they represent only one character in memory.
Consider the following demonstration:
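// JSON.stringify renders the string as a quoted, escaped literal.
// (String.prototype.toSource was non-standard, Firefox-only, and has been removed.)
const toEscaped = s => JSON.stringify(s);
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  .forEach(s => console.log(`There are ${s.length} character(s) in ${toEscaped(s)}`));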
This also applies in regular expressions. The \n is counted as one character, causing the lookahead (?=.{2}) to capture the preceding \ as well.
It seems from your comments that the underlying issue may be incorrect encoding. For instance, when a user inputs foo\nbar, it could unintentionally be interpreted as "foo\\nbar" instead of "foo\nbar". In such cases, the goal is not to remove \ characters but to convert \ + n to \n.
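A minimal sketch of that conversion, assuming the input really does arrive with a literal backslash followed by n. The helper name fixNewlines is hypothetical and used only for illustration:

```javascript
// Hypothetical helper: turns the two-character sequence backslash + n
// into a single LINE FEED (U+000A) character.
const fixNewlines = s => s.replace(/\\n/g, '\n');

const raw = 'foo\\nbar';        // 8 characters: f o o \ n b a r
const fixed = fixNewlines(raw); // 7 characters: f o o LF b a r

console.log(raw.length, fixed.length);
console.log(JSON.stringify(fixed));
```

Note that this touches only backslash + n pairs; any other backslash in the input is left alone.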
The code snippet below demonstrates how to handle escape sequence substitutions for \\ and \n:
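// JSON.stringify replaces the removed, Firefox-only String.prototype.toSource.
const toEscaped = s => JSON.stringify(s);
// Hex code of each UTF-16 code unit, joined with '+'.
const toHex = s => Array.from(s).map((_, i) => s.charCodeAt(i).toString(16).padStart(2, '0')).join('+');
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  // First turn backslash + n into LF, then collapse each doubled backslash into one.
  .map(s => ({ a: s, b: s.replace(/\\n/g, '\n').replace(/\\\\/g, '\\') }))
  .forEach(({a, b}) => console.log(`${toEscaped(a)} --> ${toHex(b)}`));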
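One caveat with chaining two replace calls is that the first pass can consume the second half of an escaped backslash. If the input is meant to follow conventional left-to-right escape rules (an assumption, since it depends on how the text was encoded), a single pass with a replacer function avoids that. The names unescapeOnce and chained are hypothetical:

```javascript
// Hypothetical helper: decodes \\ and \n in one left-to-right pass,
// so an escaped backslash is never re-read as the start of \n.
const unescapeOnce = s =>
  s.replace(/\\([\\n])/g, (_, c) => (c === 'n' ? '\n' : '\\'));

// The two-step approach, kept here for comparison.
const chained = s => s.replace(/\\n/g, '\n').replace(/\\\\/g, '\\');

const input = '\\\\n'; // three characters: \ \ n
console.log(JSON.stringify(unescapeOnce(input))); // backslash followed by n
console.log(JSON.stringify(chained(input)));      // backslash followed by LF
```

Which result is correct depends on what the three-character input was supposed to mean, so pick the variant that matches how your data was produced.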
To both replace "\\n" with "\n" and eliminate preceding "\\" characters, consider the following code snippet:
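// JSON.stringify replaces the removed, Firefox-only String.prototype.toSource.
const toEscaped = s => JSON.stringify(s);
// Hex code of each UTF-16 code unit, joined with '+'.
const toHex = s => Array.from(s).map((_, i) => s.charCodeAt(i).toString(16).padStart(2, '0')).join('+');
['\n', '\\n', '\\\n', '\\\\n', '\\\\\n']
  // One or more backslashes followed by either the letter n or a real LF become a single LF.
  .map(s => ({ a: s, b: s.replace(/\\+[n\n]/g, '\n') }))
  .forEach(({a, b}) => console.log(`${toEscaped(a)} --> ${toHex(b)}`));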