I need to tokenize a string containing markdown formatting using a regular expression. Specifically, bold text is denoted by **text** and italic text by _text_.
- For example, the string "a _b_ c **d** _e" should be split into ['a ', 'b', ' c ', 'd', ' _e'] (note: each match needs to be stored in its own group).
I have successfully captured bold and italic groups using the regex /_(.+?)_|\*\*(.+?)\*\*/g, but I am trying to expand this regex to include all other text as well. Essentially, I want to capture everything inside **, everything inside _, and the rest of the text.
I tried adding a third alternative, /_(.+?)_|\*\*(.+?)\*\*|(.*)/g, but the catch-all (.*) is greedy, so it ends up matching the bold and italic text as well.
(A quick way to test this in the browser console: Array.from('a _b_ c **d** _e'.matchAll(/_(.+?)_|\*\*(.+?)\*\*/g)))
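
For reference, here is a minimal sketch of one possible direction, not a verified answer. The third alternative consumes one character at a time and uses a negative lookahead to stop just before a complete italic or bold span. The function name tokenize and the type labels 'italic', 'bold', and 'text' are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: splits a string into italic, bold, and plain-text tokens.
function tokenize(input) {
  // Group 1: italic (_text_), group 2: bold (**text**), group 3: everything else.
  // The negative lookahead ends group 3 just before a complete italic or bold span,
  // so an unclosed delimiter (such as the trailing "_e") stays in the plain text.
  const pattern = /_(.+?)_|\*\*(.+?)\*\*|((?:(?!_.+?_|\*\*.+?\*\*)[\s\S])+)/g;

  return Array.from(input.matchAll(pattern), (m) => {
    if (m[1] !== undefined) return { type: 'italic', text: m[1] };
    if (m[2] !== undefined) return { type: 'bold', text: m[2] };
    return { type: 'text', text: m[3] };
  });
}

const tokens = tokenize('a _b_ c **d** _e');
console.log(tokens.map((t) => t.text));
// → [ 'a ', 'b', ' c ', 'd', ' _e' ]
```

Whitespace around the delimiters is kept in the plain-text tokens. Because the lookahead runs at every character, this sketch may be slow on very long inputs.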