Managing URLs in HtmlUnitDriver to either block or allow access

Blocking URLs in PhantomJS and GhostDriver is a simple process. Begin by initializing the driver with a handler:

PhantomJSDriver driver = new PhantomJSDriver();
driver.executePhantomJS(loadFile("/phantomjs/handlers.js"));

Then, set up the handler:

this.onResourceRequested = function (requestData, networkRequest) {
    // Literal dots are escaped so that "." does not match any character.
    var allowedUrls = [
        /https?:\/\/localhost.*/,
        /https?:\/\/.*\.example\.com\/?.*/
    ];
    var disallowedUrls = [
        /https?:\/\/nonono\.com.*/
    ];

    function isUrlAllowed(url) {
        function matches(url) {
            return function(re) {
                return re.test(url);
            };
        }
        return allowedUrls.some(matches(url)) && !disallowedUrls.some(matches(url));
    }

    if (!isUrlAllowed(requestData.url)) {
        console.log("Aborting disallowed request (# " + requestData.id + ") to url: '" + requestData.url + "'");
        networkRequest.abort();
    }
};

I haven't found an effective way to do this with HtmlUnitDriver. The ScriptPreProcessor mentioned in "How to filter javascript from specific urls in HtmlUnit" works on WebClient rather than HtmlUnitDriver. Any suggestions?

Answer №1

Extend HtmlUnitDriver and override modifyWebClient to install a ScriptPreProcessor for editing script content and an HttpWebConnection for controlling access to URLs:

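// Package names assume HtmlUnit 2.x (com.gargoylesoftware.htmlunit);
// HtmlUnit 3.x moved these classes to org.htmlunit.
import static org.apache.http.HttpStatus.SC_OK;

import java.io.IOException;
import java.util.ArrayList;

import com.gargoylesoftware.htmlunit.HttpWebConnection;
import com.gargoylesoftware.htmlunit.ScriptPreProcessor;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebConnection;
import com.gargoylesoftware.htmlunit.WebRequest;
import com.gargoylesoftware.htmlunit.WebResponse;
import com.gargoylesoftware.htmlunit.WebResponseData;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;
import org.openqa.selenium.remote.DesiredCapabilities;

public class FilteringHtmlUnitDriver extends HtmlUnitDriver {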

    // Regexes are tested with String.matches, so each must match the whole URL.
    private static final String[] ALLOWED_URLS = {
            "https?://localhost.*",
            "https?://.*\\.yes\\.yes/?.*",
    };
    private static final String[] DISALLOWED_URLS = {
            "https?://spam\\.nono.*"
    };

    public FilteringHtmlUnitDriver(DesiredCapabilities capabilities) {
        super(capabilities);
    }

    @Override
    protected WebClient modifyWebClient(WebClient client) {
        WebConnection connection = applyFilteringToWebConnection(client);
        ScriptPreProcessor preProcessor = applyFilteringToPreProcessor();

        client.setWebConnection(connection);
        client.setScriptPreProcessor(preProcessor);

        return client;
    }

    private ScriptPreProcessor applyFilteringToPreProcessor() {
        return (htmlPage, sourceCode, sourceName, lineNumber, htmlElement) -> manipulateContent(sourceCode);
    }

    // Placeholder rewrite: replace this with whatever script editing you need.
    private String manipulateContent(String sourceCode) {
        return sourceCode.replaceAll("foo", "bar");
    }

    private WebConnection applyFilteringToWebConnection(WebClient client) {
        return new HttpWebConnection(client) {
            @Override
            public WebResponse getResponse(WebRequest request) throws IOException {
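                String url = request.getUrl().toString();
                // Blocked requests get an empty 200 response instead of hitting the network.
                WebResponse emptyResponse = new WebResponse(
                        new WebResponseData(new byte[0], SC_OK, "", new ArrayList<>()), request, 0);

                // The deny list is checked first, so it overrides the allow list.
                for (String disallowed : DISALLOWED_URLS) {
                    if (url.matches(disallowed)) {
                        return emptyResponse;
                    }
                }
                for (String allowed : ALLOWED_URLS) {
                    if (url.matches(allowed)) {
                        return super.getResponse(request);
                    }
                }
                // Anything not explicitly allowed is blocked as well.
                return emptyResponse;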
            }
        };
    }
}

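This implementation allows for both modifying script content and restricting access to specific URLs. Disallowed URLs are checked first, and any URL that matches neither list receives an empty 200 response rather than a real one.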
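The allow/deny decision itself does not depend on HtmlUnit, so it can be pulled out and tested on its own. The following is only a minimal sketch of that idea: the `UrlFilter` class and its `isAllowed` method are hypothetical names that do not appear in the answer, and the patterns are the answer's examples with the literal dots escaped. It keeps the same precedence (deny list first, then allow list, otherwise blocked) and compiles each regex once instead of on every request.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of HtmlUnit or Selenium) holding the allow/deny rules. */
final class UrlFilter {

    private final List<Pattern> allowed;
    private final List<Pattern> disallowed;

    UrlFilter(List<String> allowedRegexes, List<String> disallowedRegexes) {
        this.allowed = compileAll(allowedRegexes);
        this.disallowed = compileAll(disallowedRegexes);
    }

    // Compile once up front; String.matches would recompile on every call.
    private static List<Pattern> compileAll(List<String> regexes) {
        return regexes.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    /** Deny list wins; a URL matching neither list is blocked. */
    boolean isAllowed(String url) {
        // matcher(...).matches() requires the whole URL to match, like String.matches.
        boolean denied = disallowed.stream().anyMatch(p -> p.matcher(url).matches());
        if (denied) {
            return false;
        }
        return allowed.stream().anyMatch(p -> p.matcher(url).matches());
    }

    public static void main(String[] args) {
        UrlFilter filter = new UrlFilter(
                Arrays.asList("https?://localhost.*", "https?://.*\\.yes\\.yes/?.*"),
                Arrays.asList("https?://spam\\.nono.*"));

        for (String url : Arrays.asList(
                "http://localhost:8080/index.html",
                "https://www.yes.yes/page",
                "http://spam.nono/ads.js",
                "http://unknown.example/")) {
            System.out.println(url + " -> " + filter.isAllowed(url));
        }
    }
}
```

Inside `getResponse`, the two loops could then be reduced to a single `filter.isAllowed(url)` check, assuming a `UrlFilter` instance is stored as a field of the driver.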
