Match URLs so that they are left untouched, and replace every other dot that is not already followed by whitespace with a dot followed by a space:
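// Group 1 captures a URL; the second alternative matches a dot not followed by whitespace.
var re = /((?:https?|ftps?):\/\/\S+)|\.(?!\s)/g;
var str = 'See also vadding.Constructions on this term abound.\nSee also vadding.Constructions on this term abound. http://example.com/foo/bar';
var result = str.replace(re, function(m, g1) {
    // Keep a matched URL unchanged; turn any other matched dot into ". ".
    return g1 ? g1 : ". ";
});
document.body.innerHTML = "<pre>" + result + "</pre>";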
The regular expression for the URL, (?:https?|ftps?):\/\/\S+, matches text starting with http, https, ftp, or ftps, followed by :// and one or more non-whitespace characters (\S+). For more complex URL matching expressions, see the Stack Overflow question "What is a good regular expression to match a URL?".
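As a rough illustration of what the URL sub-pattern accepts, the sketch below runs it against a few made-up strings. The helper name findUrl and the sample inputs are assumptions for this example only. Note that \S+ is greedy, so trailing punctuation directly after a URL becomes part of the match.

```javascript
// Sketch: the URL sub-pattern on its own (no capture group, no g flag).
const urlRe = /(?:https?|ftps?):\/\/\S+/;

// Hypothetical helper: returns the first URL-like match, or null if there is none.
function findUrl(text) {
  const match = text.match(urlRe);
  return match ? match[0] : null;
}

console.log(findUrl('see http://example.com/foo/bar now')); // "http://example.com/foo/bar"
console.log(findUrl('ftps://files.example.com/a.txt'));     // "ftps://files.example.com/a.txt"
console.log(findUrl('mailto:someone@example.com'));         // null (scheme not in the list)
console.log(findUrl('Visit http://example.com/page.'));     // "http://example.com/page." (trailing dot included)
```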
Explanation of the process in more depth:
The regex ((?:https?|ftps?):\/\/\S+)|\.(?!\s) presents two alternatives: either a URL (as explained above), or (|) a dot that is not followed by whitespace (\.(?!\s)).
IMPORTANT: (?!\s) is a negative lookahead assertion. It lets \. match only when the next character is not whitespace, and it does not consume that character, so only the dot itself is replaced.
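To see the lookahead in isolation, here is a small sketch that lists the positions where \.(?!\s) matches. The helper name dotPositions and the sample strings are made up for illustration. One detail worth noticing: a dot at the very end of the string also matches, because there is no following character at all.

```javascript
// Sketch: only the dot alternative, with the global flag so every match is found.
const dotRe = /\.(?!\s)/g;

// Hypothetical helper: returns the index of every dot the pattern matches.
function dotPositions(text) {
  return Array.from(text.matchAll(dotRe), (m) => m.index);
}

console.log(dotPositions('a.b'));   // [ 1 ]  dot followed by a letter
console.log(dotPositions('a. b'));  // []     dot followed by a space
console.log(dotPositions('a.\nb')); // []     a newline is whitespace too
console.log(dotPositions('end.'));  // [ 3 ]  nothing follows, so the lookahead succeeds

// The lookahead does not consume the next character, so "b" survives the replacement.
console.log('a.b'.replace(dotRe, '. ')); // "a. b"
```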
When using string.replace(), you can pass an anonymous callback function as the second argument to handle each match and its captured groups. Here the callback receives the whole match (m) and the value of capture group 1, g1 (the URL). If a URL was matched, g1 holds it; otherwise g1 is undefined. Therefore, return g1 ? g1 : ". "; returns the URL unchanged when group 1 matched, and replaces the matched dot with ". " (a dot followed by a space) otherwise.
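The callback logic can also be tried outside the browser. The following is a minimal sketch of the same approach wrapped in a function; the name spaceDots is an assumption for this example, and it checks the group against undefined explicitly instead of relying on truthiness.

```javascript
// Same pattern as above: group 1 is a URL, the second alternative is a dot
// that is not followed by whitespace.
const re = /((?:https?|ftps?):\/\/\S+)|\.(?!\s)/g;

// Hypothetical wrapper around the replace call.
function spaceDots(text) {
  return text.replace(re, (match, url) => {
    // url is undefined when the dot alternative matched instead of the URL.
    return url !== undefined ? url : '. ';
  });
}

const input = 'See also vadding.Constructions on this term abound. http://example.com/foo/bar';
console.log(spaceDots(input));
// "See also vadding. Constructions on this term abound. http://example.com/foo/bar"
```

The dots inside the URL are left alone because the URL alternative matches first at that position and consumes the whole address, so the dot alternative never gets to examine them.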