Is there a way to split a sentence with special characters into words while keeping the spaces? For example:
"la sílaba tónica es la penúltima".split(...regex...)
to:
["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
↑ ↑ ↑ ↑
space space space space
I tried adapting the regex from this answer. Using the code from that answer unchanged:
"la sílaba tónica es la penúltima".split(/\b(?![\s.])/)
The result is:
["la ", "s", "í", "laba ", "t", "ó", "nica ", "es ", "la ", "pen", "ú", "ltima"]
↑ ↑ ↑
The words are split at every accented character.
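As far as I can tell, this happens because `\w` (and therefore `\b`) only counts ASCII letters, digits and `_` as word characters, so an accented letter is treated like punctuation. A minimal check, assuming the accented letters are single precomposed code points (the variable names are just for illustration):

```javascript
// \w matches only [A-Za-z0-9_], so "í" is not a word character.
const plainIsWord = /\w/.test("i");
const accentedIsWord = /\w/.test("í");
console.log(plainIsWord, accentedIsWord); // true false

// That puts a \b boundary on both sides of "í" inside a word.
const pieces = "sílaba".split(/\b/);
console.log(pieces); // ["s", "í", "laba"]
```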
In my attempt, I added the special characters I want to keep (.áéíóúñ,:;?) to the negative lookahead:
"la sílaba tónica es la penúltima".split(/\b(?![\s.áéíóúñ,:;?])/)
The result is:
["la ", "sí", "laba ", "tó", "nica ", "es ", "la ", "penú", "ltima"]
↑ ↑ ↑
The accented characters now stay attached to the letters before them, but the words still break right after them. What would be the correct regular expression for this task?
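For reference, here is a sketch of the behaviour I am after, written without `\b` at all. It assumes words are separated only by whitespace; `splitKeepingSpaces` and `splitWithLookbehind` are hypothetical helper names, and the second variant relies on lookbehind support (ES2018):

```javascript
// Hypothetical helper: each word keeps the whitespace that follows it.
function splitKeepingSpaces(sentence) {
  // \S+ takes a run of non-whitespace (accented letters included),
  // \s* then takes any whitespace after it.
  return sentence.match(/\S+\s*/g) || [];
}

// Same idea with split(): cut wherever whitespace is followed by non-whitespace.
function splitWithLookbehind(sentence) {
  return sentence.split(/(?<=\s)(?=\S)/);
}

const sentence = "la sílaba tónica es la penúltima";
console.log(splitKeepingSpaces(sentence));
// ["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
console.log(splitWithLookbehind(sentence));
// ["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
```

Note that the `match` variant drops any leading whitespace, while the `split` variant keeps it as part of the first element.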