Strip all whitespace from an entire HTML document, excluding any content within a <pre> tag

While working with ASP.NET MVC 3, I implemented an action filter that removes whitespace from the entire HTML output. It works as expected most of the time, but now I need to adjust the regex so that it leaves the content of pre elements untouched.

I took the regex from Mads Kristensen's blog, but I am not sure how to adapt it for this purpose.

Here is the current logic:

public override void Write(byte[] buffer, int offset, int count) {

    // Decode the current chunk of the response as UTF-8 text.
    string HTML = Encoding.UTF8.GetString(buffer, offset, count);

    // Strip every whitespace run matched by the pattern.
    Regex reg = new Regex(@"(?<=[^])\t{2,}|(?<=[>])\s{2,}(?=[<])|(?<=[>])\s{2,11}(?=[<])|(?=[\n])\s{2,}");
    HTML = reg.Replace(HTML, string.Empty);

    // Re-encode the result and write it through Base.
    buffer = System.Text.Encoding.UTF8.GetBytes(HTML);
    this.Base.Write(buffer, 0, buffer.Length);
}

The complete code for the filter can be found here:

https://github.com/tugberkugurlu/MvcBloggy/blob/master/src/MvcBloggy.Web/Application/ActionFilters/RemoveWhitespacesAttribute.cs

Any suggestions?

EDIT:

BIG NOTE:

Please note that my primary goal is not to speed up response time; in fact, this may slow things down. The pages are already gzipped, and this minification only saves approximately 4-5 KB per page, which is negligible.

Answer №1

Processing HTML with regular expressions is fragile, and any simple solution can break easily. (Always choose the appropriate tool for the task.) That said, here is a simple solution.

First, I simplified the regex to:

(?<=\s)\s+

This matches any run of whitespace that follows a whitespace character, so replacing the matches with an empty string collapses every run of whitespace to its first character.

If the content of the pre element contains no < or > characters, you can append (?![^<>]*</pre>) to the expression to keep it from matching inside pre elements. This negative lookahead rejects a match when a closing </pre> follows it with no other tag in between.

This results in:

(?<=\s)\s+(?![^<>]*</pre>)
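
As a minimal sketch of how this expression might be used, the snippet below wraps it in a helper. The class name WhitespaceCollapser and the method name Collapse are hypothetical and do not come from the original filter. The same limitation applies as stated above: if a pre element contains other tags (for example <code>), whitespace before those inner tags is not protected.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original filter.
public static class WhitespaceCollapser
{
    // Whitespace that follows another whitespace character, unless only
    // tag-free text separates it from a closing </pre>.
    private static readonly Regex ExtraWhitespace =
        new Regex(@"(?<=\s)\s+(?![^<>]*</pre>)", RegexOptions.Compiled);

    // Collapses each run of whitespace outside pre content to its first character.
    public static string Collapse(string html)
    {
        return ExtraWhitespace.Replace(html, string.Empty);
    }
}
```

For example, WhitespaceCollapser.Collapse("<p>a    b</p><pre>x    y</pre>") returns "<p>a b</p><pre>x    y</pre>": the run inside the paragraph is collapsed, while the run inside the pre element is left alone.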

Answer №2

See the well-known question "RegEx match open tags except XHTML self-contained tags" to understand why regular expressions and HTML do not always work well together.

If the goal of the method above is to reduce page size, IIS compression can be a more efficient and streamlined approach. It is available in both IIS 6 and IIS 7; the following guide describes how to set it up:

http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/502ef631-3695-4616-b268-cbe7cf1351ce.mspx?mfr=true

Answer №3

If you're looking to break this process down into more manageable steps, consider the following four stages:

  1. Use regex to extract any matching PRE elements by searching for text between "<pre>" and "</pre>".
  2. Assign a unique identifier (GUID) to each match and store the GUID in a dictionary along with its corresponding pre element HTML.
  3. Remove unnecessary whitespace from the content while preserving the GUIDs and their positions.
  4. Iterate through the dictionary created in step 2 and place the pre elements back in their original locations.
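
The four steps above could be sketched roughly as follows. This assumes the whole document is available as a single string (a pre element split across two Write calls would not be caught), and the class name PrePreservingMinifier, the method name Minify, and the whitespace pattern used in step 3 are illustrative choices rather than anything prescribed by the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; names are not from the original filter.
public static class PrePreservingMinifier
{
    // Step 1: match each <pre ...>...</pre> block, non-greedy, across newlines.
    private static readonly Regex PreBlock = new Regex(
        @"<pre\b[^>]*>.*?</pre>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Assumed collapsing rule for step 3; any whitespace pattern could be used here.
    private static readonly Regex ExtraWhitespace = new Regex(@"(?<=\s)\s+");

    public static string Minify(string html)
    {
        var saved = new Dictionary<string, string>();

        // Step 2: swap each pre block for a GUID token and remember the original HTML.
        string withTokens = PreBlock.Replace(html, match =>
        {
            string token = Guid.NewGuid().ToString("N"); // 32 hex digits, no whitespace
            saved[token] = match.Value;
            return token;
        });

        // Step 3: collapse whitespace; the tokens contain none, so they survive intact.
        string minified = ExtraWhitespace.Replace(withTokens, string.Empty);

        // Step 4: put every pre block back where its token sits.
        foreach (KeyValuePair<string, string> entry in saved)
        {
            minified = minified.Replace(entry.Key, entry.Value);
        }

        return minified;
    }
}
```

Because the pre blocks are removed before the whitespace pass runs, this variant also tolerates tags inside a pre element, which the lookahead-based approach does not.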
