In the JavaScript book "The Good Parts", there is an explanation of the method string.match(regexp):
The match method works by comparing a string with a regular expression. Its behavior varies based on whether or not the g flag is present. Without the g flag, calling string.match(regexp) is essentially the same as using regexp.exec(string). However, if the regex has the g flag, it returns an array of all matches while excluding any capturing groups.
The book then provides a code example:
var text = '<html><body bgcolor=linen><p>This is <b>bold</b>!</p></body></html>';
var tags = /[^<>]+|<(\/?)([A-Za-z]+)([^<>]*)>/g;
var a, i;
a = text.match(tags);
for (i = 0; i < a.length; i += 1) {
    document.writeln(('// [' + i + '] ' + a[i]).entityify());
}
// The result is
// [0] <html>
// [1] <body bgcolor=linen>
// [2] <p>
// [3] This is
// [4] <b>
// [5] bold
// [6] </b>
// [7] !
// [8] </p>
// [9] </body>
// [10] </html>
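For comparison, here is a minimal sketch of the two cases the quoted passage describes, using the same pattern on shorter strings. The variable names (sample, globalMatches, firstMatch) are my own and do not come from the book.

```javascript
// With the g flag: match returns one array entry per full match.
var sample = '<b>bold</b>';
var globalMatches = sample.match(/[^<>]+|<(\/?)([A-Za-z]+)([^<>]*)>/g);
// globalMatches is ['<b>', 'bold', '</b>']

// Without the g flag: exec returns the first full match,
// followed by one entry per capturing group.
var firstMatch = /[^<>]+|<(\/?)([A-Za-z]+)([^<>]*)>/.exec('</b>');
// firstMatch[0] is '</b>' (the whole match)
// firstMatch[1] is '/'    (group 1: the optional slash)
// firstMatch[2] is 'b'    (group 2: the tag name)
// firstMatch[3] is ''     (group 3: the rest of the tag)
```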
I'm having trouble understanding the concept of "excluding capturing groups".
In the provided code, the html in </html> is within a capturing group. Why is it still included in the result array?
Similarly, the / in </html> is also part of a capturing group. Why is it showing up in the result array?
Can you please clarify what "excluding capturing groups" means in the context of this code example?
Thank you for your help!