I'm currently working on a web app that includes a quick search feature for articles.
The feature has two parts:
- Page
- A global array (JSON, 100-150 items) of articles fetched through ajax. Each item has the fields id, title, and snippet. Titles and snippets may contain simple style markup tags.
When a user types a query in the popup quick search field, the app does the following:
- Searches within the global array
- If matches are found, they are added to a temporary search results array (which is cached)
- Highlights the matches in the temporary results array and displays them to the user
It is important to note that the original array remains unmodified.
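Roughly, the search and cache steps above could be sketched like this. This is a simplified illustration rather than the real code: the names `articles`, `searchCache`, and `quickSearch` and the sample data are placeholders, and the highlighting step is left out.

```javascript
// Hypothetical sample data; the real array is fetched through ajax.
const articles = [
  { id: 1, title: "First <b>item</b>", snippet: "Some text" },
  { id: 2, title: "Second", snippet: "Another <i>item</i> here" },
];

// Cache of temporary result arrays, keyed by the lowercased query.
const searchCache = new Map();

function quickSearch(query) {
  const key = query.toLowerCase();
  if (!searchCache.has(key)) {
    const results = articles
      .filter(function (a) {
        // Plain indexOf on the raw strings, markup included.
        return a.title.toLowerCase().indexOf(key) !== -1 ||
               a.snippet.toLowerCase().indexOf(key) !== -1;
      })
      .map(function (a) {
        // Shallow copy, so later highlighting never touches the original item.
        return Object.assign({}, a);
      });
    searchCache.set(key, results);
  }
  return searchCache.get(key);
}

console.log(quickSearch("item").length); // 2
```

Because indexOf runs on the raw strings, a query such as "b" would also hit the `<b>` tag itself in this sketch.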
Currently, I am using the basic String.indexOf method, but it cannot reliably match text inside HTML-formatted strings.
My question is about RegEx patterns. I know that manipulating HTML with RegEx is not recommended, and that the result may not be semantically valid markup, but it is good enough for this purpose.
For instance, given
<ul><li>Item <i><span style="color:red">Y</span></i></li></ul>
suppose we want to highlight the letter e. The expected result should be:
... It<em>e</em>m ...
However, using a simple replace(/e/ig, '<em>$&</em>') will also target the letter 'e' within the style attribute.
In other words, what RegEx pattern can be used to avoid affecting words within HTML tags?
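One possible approach, sketched here rather than offered as a definitive answer, is a negative lookahead that rejects a match when the next angle bracket after it is a closing `>`, which would mean the match sits inside a tag. The function name `highlightOutsideTags` is made up for illustration, and the sketch assumes that attribute values never contain a literal `<` or `>` and that any `>` in the visible text is written as `&gt;`.

```javascript
// Hypothetical helper: wraps matches of `term` in <em>, skipping text inside tags.
function highlightOutsideTags(html, term) {
  if (!term) return html; // nothing to highlight
  // Escape RegExp metacharacters so the term is matched literally.
  const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // (?![^<>]*>) fails when only non-bracket characters separate the match
  // from a ">", i.e. when the match lies inside a tag (assumption: no
  // unescaped "<" or ">" in attribute values or in the text).
  const pattern = new RegExp(escaped + "(?![^<>]*>)", "gi");
  return html.replace(pattern, "<em>$&</em>");
}

const sample = '<ul><li>Item <i><span style="color:red">Y</span></i></li></ul>';
console.log(highlightOutsideTags(sample, "e"));
// <ul><li>It<em>e</em>m <i><span style="color:red">Y</span></i></li></ul>
```

The 'e' in style and in red is skipped because only attribute characters lie between it and the closing `>`, while the 'e' in Item is followed by a `<` first.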
Another example: if we want to highlight Item Y, the desired output would be:
<ul><li><em>Item <i><span style="color:red">Y</em></span></i></li></ul>
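For a query that spans several tags, one way to get the output above is to allow any number of complete tags between consecutive characters of the query. The following is a rough sketch under the same assumptions as before (no unescaped `<` or `>` inside attribute values or text, and a plain-text query without angle brackets); `highlightAcrossTags` is a hypothetical name. As in the desired output, the inserted `<em>` is not properly nested with the surrounding tags.

```javascript
// Hypothetical helper: highlights `query` even when tags interrupt the text.
function highlightAcrossTags(html, query) {
  if (!query) return html; // nothing to highlight
  const anyTags = "(?:<[^>]*>)*"; // zero or more complete tags
  const body = Array.from(query)
    .map(function (ch) {
      // Escape RegExp metacharacters so each character is matched literally.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join(anyTags);
  // The trailing lookahead rejects matches that end inside a tag.
  const pattern = new RegExp(body + "(?![^<>]*>)", "gi");
  return html.replace(pattern, "<em>$&</em>");
}

const markup = '<ul><li>Item <i><span style="color:red">Y</span></i></li></ul>';
console.log(highlightAcrossTags(markup, "Item Y"));
// <ul><li><em>Item <i><span style="color:red">Y</em></span></i></li></ul>
```

Since the result is assigned as HTML, the browser will repair the misnested `<em>` in its own way, so the rendered highlight may differ slightly from the string.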