Include the referrer header when using the chrome.downloads API

Question

I have been working on developing an extension that replicates the Ctrl+Click to save image functionality from Opera 12. In my extension, I am utilizing chrome.downloads.download() in the background script to trigger downloads, while a content script is responsible for detecting user actions and sending a message containing the image URL to download.

Most of the functionality works smoothly, but on certain websites such as pixiv.net the downloads are interrupted and fail. Inspecting the traffic with the webRequest API, I noticed that the cookie from the active tab is sent with the download request, but no Referer header is included. My assumption is that these sites block download requests that do not appear to come from their own pages. Unfortunately, I have not been able to confirm this, because the webRequest.onErrorOccurred event does not fire for failed downloads.

The challenge I'm facing is that I am unable to set the referer header manually since it cannot be done through chrome.downloads, and webRequest.onBeforeSendHeaders cannot be applied directly to a download request, making it impossible to add headers later on. Is there a way to initiate a download within the context of a tab to mimic the behavior of right-click > save as...?

To provide more clarity on how I'm initiating the downloads, here is a simplified version of my TypeScript code:

Injected script:

let suspended = false;

window.addEventListener('click', (e) => {
    if (suspended) {
        return;
    }

    if (e.ctrlKey && (<Element>e.target).nodeName === 'IMG') {
        chrome.runtime.sendMessage({
            url: (<HTMLImageElement>e.target).src,
            saveAs: true,
        });

        e.preventDefault();
        e.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(() => suspended = false, 100);
    }
}, true);

Background script:

interface DownloadMessage {
    url: string;
    saveAs: boolean;
}

chrome.runtime.onMessage.addListener((message: DownloadMessage, sender, sendResponse) => {
    chrome.downloads.download({
        url: message.url,
        saveAs: message.saveAs,
    });
});

Update

Building upon ExpertSystem's solution below, I've come up with an approach that mostly resolves the issue. Now, when the background script receives a download request for an image, I check the host name of the URL against a list of specified sites that require a workaround. If a workaround is necessary, I send a message back to the tab to handle the download using an XMLHttpRequest with responseType = 'blob'. Then, I use URL.createObjectURL() on the blob and transmit that URL back to the background script for downloading. This method avoids any size limitations associated with data URIs. Additionally, in case the XHR request fails, I ensure to retry using the standard method to display a failed download prompt to the user.

The updated code structure looks like this:

Injected Script:

// ...Original code here...

chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    switch (message.action) {
        case 'xhr-download':
            var xhr = new XMLHttpRequest();
            xhr.responseType = 'blob';

            xhr.addEventListener('load', (e) => {
                chrome.runtime.sendMessage({
                    url: URL.createObjectURL(xhr.response),
                    filename: message.url.substr(message.url.lastIndexOf('/') + 1),
                    saveAs: message.saveAs,
                });
            });

            xhr.addEventListener('error', (e) => {
                chrome.runtime.sendMessage({
                    url: message.url,
                    saveAs: message.saveAs,
                    forceDownload: true,
                });
            });

            xhr.open('get', message.url, true);
            xhr.send();
            break;
    }
});
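
The filename above is taken as everything after the last slash, so a query string or fragment (for example `?size=large`) would end up in the name. A minimal sketch of a stricter helper, assuming the standard URL class is available; `filenameFromUrl` and the `'download'` fallback are hypothetical names, not part of the original code:

```javascript
// Hypothetical helper: derive a file name from the URL path only,
// ignoring any query string or fragment.
function filenameFromUrl(url, fallback = 'download') {
    let pathname;
    try {
        pathname = new URL(url).pathname;
    } catch (e) {
        return fallback; // not an absolute URL
    }
    const last = pathname.substring(pathname.lastIndexOf('/') + 1);
    return last || fallback; // empty when the path ends with a slash
}
```

It could replace the `substr()` expression in the 'load' handler, e.g. `filename: filenameFromUrl(message.url)`.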

Background Script:

interface DownloadMessage {
    url: string;
    saveAs: boolean;
    filename?: string;
    forceDownload?: boolean;
}

chrome.runtime.onMessage.addListener((message: DownloadMessage, sender, sendResponse) => {
    var downloadOptions: chrome.downloads.DownloadOptions = {
        url: message.url,
        saveAs: message.saveAs,
    };

    if (message.filename) {
        downloadOptions.filename = message.filename;
    }

    var a = document.createElement('a');
    a.href = message.url;

    if (message.forceDownload || a.protocol === 'blob:' || !needsReferralWorkaround(a.hostname)) {
        chrome.downloads.download(downloadOptions);

        if (a.protocol === 'blob:') {
            URL.revokeObjectURL(message.url);
        }
    } else {
        chrome.tabs.sendMessage(sender.tab.id, { 
            action: 'xhr-download', 
            url: message.url, 
            saveAs: message.saveAs
        });
    }
});
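
The needsReferralWorkaround() helper is not shown above. A minimal sketch of what it might look like, assuming a hand-maintained list of hosts; the list contents and the subdomain matching rule are assumptions for illustration:

```javascript
// Assumption: a hand-maintained list of sites known to reject
// requests that arrive without a Referer header.
const REFERER_CHECKED_HOSTS = ['pixiv.net'];

// Returns true if `hostname` is one of the listed hosts or a subdomain of one.
function needsReferralWorkaround(hostname) {
    const host = hostname.toLowerCase();
    return REFERER_CHECKED_HOSTS.some((site) =>
        host === site || host.endsWith('.' + site));
}
```

Matching on a leading dot keeps unrelated hosts that merely end with the same characters (such as `notpixiv.net`) out of the workaround path.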

This solution may encounter challenges if the image is hosted on a different domain than the page and the server does not send CORS headers. However, I believe there won't be many sites restricting image downloads based on referrer and serving images from a distinct domain.
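
One way to tell in advance whether a request would be cross-origin is to compare the origins of the image and the page before choosing the code path. A sketch, assuming the page URL is known (for example `location.href` in the content script); `isSameOrigin` is a hypothetical name:

```javascript
// Hypothetical helper: true when both URLs share scheme, host and port.
function isSameOrigin(imageUrl, pageUrl) {
    try {
        return new URL(imageUrl).origin === new URL(pageUrl).origin;
    } catch (e) {
        return false; // treat unparsable URLs as cross-origin
    }
}
```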

javascript google-chrome-extension

Answer 1

The chrome.downloads.download() method's first argument (options) can contain the headers property, which is an array of objects:

Extra HTTP headers to send with the request if the URL uses the HTTP[s] protocol. Each header is represented as a dictionary containing the keys name and either value or binaryValue, restricted to those allowed by XMLHttpRequest.

UPDATE:

Unfortunately, the clause "restricted to those allowed by XMLHttpRequest" complicates things, especially since even the Referer header isn't permitted for XHR requests.
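
To make the restriction concrete, here is a sketch of a check against the forbidden request-header names from the Fetch standard, which governs what XMLHttpRequest may set. The list below is written from memory and may not be complete or current, and it is an assumption that chrome.downloads applies a comparable list; `isForbiddenHeader` and `allowedHeaders` are hypothetical helpers:

```javascript
// Forbidden request-header names (lower-cased), per the Fetch standard.
// Assumption: chrome.downloads rejects a comparable set.
const FORBIDDEN_HEADERS = new Set([
    'accept-charset', 'accept-encoding', 'access-control-request-headers',
    'access-control-request-method', 'connection', 'content-length',
    'cookie', 'cookie2', 'date', 'dnt', 'expect', 'host', 'keep-alive',
    'origin', 'referer', 'te', 'trailer', 'transfer-encoding', 'upgrade', 'via',
]);

// Header names are case-insensitive; Proxy-* and Sec-* are forbidden by prefix.
function isForbiddenHeader(name) {
    const lower = name.toLowerCase();
    return FORBIDDEN_HEADERS.has(lower) ||
        lower.startsWith('proxy-') || lower.startsWith('sec-');
}

// Drop headers that would be rejected anyway.
function allowedHeaders(headers) {
    return headers.filter((h) => !isForbiddenHeader(h.name));
}
```

Since `Referer` is on the list, passing it through the headers option is not expected to help here.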

I've spent some time exploring this issue but haven't found a fully satisfactory solution or workaround yet. However, I have made progress, so I'll share my findings here in case they prove helpful to someone else. (Moreover, upcoming enhancements in HTML5 specifications could potentially make this a viable workaround.)

One straightforward and quick alternative (though with a notable drawback) is to create a link (<a> element) that points directly to the image source URL and "click" it programmatically.

Pros:

Simple and fast to implement.
Avoids issues related to headers and CORS (further details below) as it operates within the webpage context.
No need for the chrome.downloads API.
No background page required.

Cons:

Lack of control over where the file is saved (although you can specify the filename) and whether a file dialog appears.

If the default download location works for your needs, this approach might be suitable :)

Implementation:

var suspended = false;
window.addEventListener('click', function(evt) {
    if (suspended) {
        return;
    }

    if (evt.ctrlKey && (evt.target.nodeName === 'IMG')) {
        var a = document.createElement('a');
        a.href = evt.target.src;
        a.target = '_blank';
        a.download = a.href.substring(a.href.lastIndexOf('/') + 1);
        a.click();

        evt.preventDefault();
        evt.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(function() {
            suspended = false;
        }, 100);
    }
}, true);

In an attempt to overcome the limitations of the above solution, I experimented with the following process:

1. When an image is Ctrl+clicked, the source URL is sent to a background page.
2. The background page opens the URL in a new tab, ensuring the tab has the same origin as the image (this is critical later on).
3. The background page injects code into the new tab that:
   a. Creates a <canvas> element and draws the image onto it.
   b. Converts the drawn image to a dataURL(*).
   c. Returns the dataURL to the background page for further action.
4. The background page receives the dataURL, closes the opened tab, and triggers a download using chrome.downloads.download() with the received dataURL as the url value.

(*): To address "Information leakage," the canvas element prohibits converting an image to a dataURL unless it shares the same origin as the current webpage. Hence, opening the image's source URL in a new tab was necessary.

Pros:

Enables control over displaying the file dialog.
Includes all conveniences offered by the chrome.downloads API (whether needed or not).
Almost meets expectations :/

Cons - Caveats:

Relatively slow due to loading the image in the new tab.
The maximum image size limit depends on the maximum URL length allowed. While specifics aren't readily available, estimates put acceptable image sizes at a few hundred MB, posing a limitation.
Mainly, canvas' toDataURL() returns data at 96dpi, which poses a challenge for high-resolution images.
Fortunately, the sibling method toDataURLHD() provides data at the native canvas bitmap resolution.
However, Google Chrome currently does not support toDataURLHD().
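
To judge whether an image is likely to fit within a URL length limit, the size of the resulting data URL can be estimated: base64 encodes every 3 bytes as 4 characters (with padding), plus a short prefix. A small sketch; `estimateDataUrlLength` is a hypothetical helper, and the byte count refers to the re-encoded canvas output, which may differ from the original file size:

```javascript
// Length in characters of a base64 data URL holding `byteLength` bytes.
function estimateDataUrlLength(byteLength, mimeType = 'image/png') {
    const prefix = 'data:' + mimeType + ';base64,';
    const base64Length = 4 * Math.ceil(byteLength / 3); // padded to a multiple of 4
    return prefix.length + base64Length;
}
```

In other words, the data URL is roughly a third larger than the encoded image itself.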

More information on canvas, toDataURL(), and toDataURLHD() methods can be found here. Hopefully, future support for toDataURLHD() will make this solution viable :)

Implementation:

An extension sample would comprise three files:

manifest.json: The manifest
content.js: The content script
background.js: The background page

manifest.json:

{
    "manifest_version": 2,

    "name":    "Test Extension",
    "version": "0.0",
    "offline_enabled": false,

    "background": {
        "persistent": false,
        "scripts": ["background.js"]
    },

    "content_scripts": [{
        "matches":    ["*://*/*"],
        "js":         ["content.js"],
        "run_at":     "document_idle",
        "all_frames": true
    }],

    "permissions": [
        "downloads",
        "*://*/*"
    ]
}

content.js:

var suspended = false;
window.addEventListener('click', function(evt) {
    if (suspended) {
        return;
    }

    if (evt.ctrlKey && (evt.target.nodeName === 'IMG')) {

        /* Initialize the "download" process
         * for the specified image's source-URL */
        chrome.runtime.sendMessage({
            action: 'downloadImgStart',
            url:    evt.target.src
        });

        evt.preventDefault();
        evt.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(function() {
            suspended = false;
        }, 100);
    }
}, true);

background.js:

/* This function, injected into the tab with the image,
 * handles putting the image into a canvas and converting it to a dataURL
 * to be sent back to the background page for processing */
var imgToDataURL = function() {
    /* Determine image details like name, type, quality */
    var src     = window.location.href;
    var name    = src.substring(src.lastIndexOf('/') + 1);
    var type    = /\.jpe?g([?#]|$)/i.test(name) ? 'image/jpeg' : 'image/png';
    var quality = 1.0;

    /* Load image onto canvas and convert to dataURL */
    var img = document.body.querySelector('img');
    var canvas = document.createElement('canvas');
    canvas.width = img.naturalWidth;
    canvas.height = img.naturalHeight;
    var ctx = canvas.getContext('2d');
    ctx.drawImage(img, 0, 0);
    var dataURL = canvas.toDataURL(type, quality);

    /* Update `name` if the specified type isn't supported */
    if ((type !== 'image/png') && (dataURL.indexOf('data:image/png') === 0)) {
        name += '.png';
    }

    /* Send dataURL and `name` back to background page */
    chrome.runtime.sendMessage({
        action: 'downloadImgEnd',
        url:    dataURL,
        name:   name
    });
}

/* Inject into webpage containing image */
var codeStr = '(' + imgToDataURL + ')();';

/* Listen for messages from content scripts */
chrome.runtime.onMessage.addListener(function(msg, sender) {

    /* Validate message contains 'URL' */
    if (!msg.url) {
        console.log('Invalid message format: ', msg);
        return;
    }

    switch (msg.action) {
    case 'downloadImgStart':
        /* Request from original page:
         * Open image's source URL in a new unfocused tab
         * (avoid CORS-related errors) and inject 'imgToDataURL' */
        chrome.tabs.create({
            url: msg.url,
            active: false
        }, function(tab) {
            chrome.tabs.executeScript(tab.id, {
                code:      codeStr,
                runAt:     'document_idle',
                allFrames: false
            });
        });
        break;
    case 'downloadImgEnd':
        /* DataURL acquired successfully!
         * Close background tab and initiate download */
        chrome.tabs.remove(sender.tab.id);
        chrome.downloads.download({
            url:      msg.url,
            filename: msg.name || '',
            saveAs:   true
        });
        break;
    default:
        /* Report invalid message 'action' */
        console.log('Invalid action: ', msg.action, ' (', msg, ')');
        break;
    }
});
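
The codeStr line above relies on function-to-string coercion: concatenating a function with a string yields the function's source, so wrapping it in parentheses and appending `()` produces an immediately invoked expression. A minimal sketch that can be run outside the extension, with a hypothetical `sample` function and an indirect eval standing in for the tab where chrome.tabs.executeScript would run the code:

```javascript
// Stand-in for a function such as imgToDataURL. It must not rely on
// variables from the surrounding scope, since only its source text is sent.
function sample() {
    return 6 * 7;
}

// String concatenation calls sample.toString(), yielding its source code.
const code = '(' + sample + ')();';

// Indirect eval runs the string in the global scope, much like the
// injected code runs in a context separate from the background page.
const result = (0, eval)(code);
```

This is why imgToDataURL recomputes everything it needs (URL, name, type) inside its own body instead of reading values from the background page.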

Apologies for the lengthy response (which doesn't present a definitive solution).

Hopefully, there are useful insights here (or it saves others time experimenting with similar approaches).
