Include the referrer header when using the chrome.downloads API

I have been working on developing an extension that replicates the Ctrl+Click to save image functionality from Opera 12. In my extension, I am utilizing chrome.downloads.download() in the background script to trigger downloads, while a content script is responsible for detecting user actions and sending a message containing the image URL to download.

Most of the functionality works smoothly, but on certain websites such as pixiv.net the downloads are interrupted and fail. Inspecting the requests with the webRequest API, I noticed that the cookie from the active tab is sent with the download request, but no Referer header is included. My assumption is that these sites block download requests that do not originate from their own pages. Unfortunately, I have not been able to confirm this, because the webRequest.onErrorOccurred event does not fire for failed downloads.
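For reference, the header check described above can be sketched as follows. This is only a minimal sketch: `HeaderLike` and `findReferer` are hypothetical names, and the object shape is assumed to mirror the `requestHeaders` entries that webRequest listeners receive.

```typescript
// Hypothetical local type, assumed to mirror one entry of the
// `requestHeaders` array handed to webRequest listeners.
interface HeaderLike {
    name: string;
    value?: string;
}

// Returns the Referer value, or undefined when the header is absent.
// HTTP header names are case-insensitive, so compare in lower case.
function findReferer(headers: HeaderLike[]): string | undefined {
    for (let i = 0; i < headers.length; i++) {
        if (headers[i].name.toLowerCase() === 'referer') {
            return headers[i].value;
        }
    }
    return undefined;
}
```

Calling `findReferer(details.requestHeaders)` inside a listener would then show whether a given request carried the header.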

The challenge I'm facing is that I am unable to set the referer header manually since it cannot be done through chrome.downloads, and webRequest.onBeforeSendHeaders cannot be applied directly to a download request, making it impossible to add headers later on. Is there a way to initiate a download within the context of a tab to mimic the behavior of right-click > save as...?

To provide more clarity on how I'm initiating the downloads, here is a simplified version of my TypeScript code:

Injected script:

// Guards against handling the same click more than once.
let suspended = false;

window.addEventListener('click', (e) => {
    if (suspended) {
        return;
    }

    if (e.ctrlKey && (<Element>e.target).nodeName === 'IMG') {
        chrome.runtime.sendMessage({
            url: (<HTMLImageElement>e.target).src,
            saveAs: true,
        });

        e.preventDefault();
        e.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(() => suspended = false, 100);
    }
}, true);

Background script:

interface DownloadMessage {
    url: string;
    saveAs: boolean;
}

chrome.runtime.onMessage.addListener((message: DownloadMessage, sender, sendResponse) => {
    chrome.downloads.download({
        url: message.url,
        saveAs: message.saveAs,
    });
});

Update

Building on ExpertSystem's solution below, I've come up with an approach that mostly resolves the issue. When the background script receives a download request for an image, I check the host name of the URL against a list of sites that require a workaround. If a workaround is necessary, I send a message back to the tab, which fetches the image using an XMLHttpRequest with responseType = 'blob'. The tab then calls URL.createObjectURL() on the blob and sends that URL back to the background script for downloading. This avoids the size limitations associated with data URIs. If the XHR fails, I retry with the standard method so that the user at least sees a failed download.

The updated code structure looks like this:

Injected Script:

// ...Original code here...

chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    switch (message.action) {
        case 'xhr-download':
            var xhr = new XMLHttpRequest();
            xhr.responseType = 'blob';

            xhr.addEventListener('load', (e) => {
                chrome.runtime.sendMessage({
                    url: URL.createObjectURL(xhr.response),
                    filename: message.url.substr(message.url.lastIndexOf('/') + 1),
                    saveAs: message.saveAs,
                });
            });

            xhr.addEventListener('error', (e) => {
                chrome.runtime.sendMessage({
                    url: message.url,
                    saveAs: message.saveAs,
                    forceDownload: true,
                });
            });

            xhr.open('get', message.url, true);
            xhr.send();
            break;
    }
});

Background Script:

interface DownloadMessage {
    url: string;
    saveAs: boolean;
    filename?: string;
    forceDownload?: boolean;
}

chrome.runtime.onMessage.addListener((message: DownloadMessage, sender, sendResponse) => {
    var downloadOptions: chrome.downloads.DownloadOptions = {
        url: message.url,
        saveAs: message.saveAs,
    };

    if (message.filename) {
        downloadOptions.filename = message.filename;
    }

    var a = document.createElement('a');
    a.href = message.url;

    if (message.forceDownload || a.protocol === 'blob:' || !needsReferralWorkaround(a.hostname)) {
        chrome.downloads.download(downloadOptions);

        if (a.protocol === 'blob:') {
            URL.revokeObjectURL(message.url);
        }
    } else {
        chrome.tabs.sendMessage(sender.tab.id, { 
            action: 'xhr-download', 
            url: message.url, 
            saveAs: message.saveAs
        });
    }
});

This solution may encounter challenges if the image is hosted on a different domain than the page and the server does not send CORS headers. However, I believe there won't be many sites restricting image downloads based on referrer and serving images from a distinct domain.
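The needsReferralWorkaround() helper called in the background script is not shown above. A minimal sketch of what it could look like, assuming a hard-coded host list (the `REFERER_CHECKED_HOSTS` name and its single entry are illustrative, not my actual list):

```typescript
// Hypothetical list of hosts that reject requests lacking a matching Referer.
const REFERER_CHECKED_HOSTS: string[] = ['pixiv.net'];

// True when `hostname` equals a listed host or is a subdomain of one.
function needsReferralWorkaround(hostname: string): boolean {
    const host = hostname.toLowerCase();
    for (let i = 0; i < REFERER_CHECKED_HOSTS.length; i++) {
        const listed = REFERER_CHECKED_HOSTS[i];
        const suffix = '.' + listed;
        // Match on a leading dot so "notpixiv.net" is not treated as a subdomain.
        if (host === listed || host.slice(-suffix.length) === suffix) {
            return true;
        }
    }
    return false;
}
```

Matching on the dotted suffix rather than a plain substring keeps unrelated hosts that merely end in the same letters out of the workaround path.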

Answer №1

The first argument (options) of chrome.downloads.download() can contain a headers property, which is an array of objects:

Extra HTTP headers to send with the request if the URL uses the HTTP[s] protocol. Each header is represented as a dictionary containing the keys name and either value or binaryValue, restricted to those allowed by XMLHttpRequest.

UPDATE:

Unfortunately, the clause "restricted to those allowed by XMLHttpRequest" makes all the difference: Referer is one of the headers that XMLHttpRequest does not allow scripts to set.
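To make the restriction concrete, here is a rough sketch of the kind of check involved. It is an illustration, not Chrome's actual implementation: `FORBIDDEN_NAMES` is a deliberately non-exhaustive subset of the header names XMLHttpRequest rejects, and `isForbiddenForXhr` and `allowedHeaders` are hypothetical helpers.

```typescript
// Non-exhaustive subset of the header names that XMLHttpRequest
// refuses to let scripts set (lower-cased for comparison).
const FORBIDDEN_NAMES: string[] = [
    'accept-charset', 'accept-encoding', 'connection', 'content-length',
    'cookie', 'date', 'host', 'keep-alive', 'origin', 'referer',
    'transfer-encoding', 'upgrade', 'via',
];

// True for the names listed above and for any "Proxy-" or "Sec-" header.
function isForbiddenForXhr(name: string): boolean {
    const lower = name.toLowerCase();
    return FORBIDDEN_NAMES.indexOf(lower) !== -1 ||
        lower.indexOf('proxy-') === 0 ||
        lower.indexOf('sec-') === 0;
}

interface DownloadHeader {
    name: string;
    value: string;
}

// Keeps only the headers that would pass the check above.
function allowedHeaders(headers: DownloadHeader[]): DownloadHeader[] {
    return headers.filter(function (h) {
        return !isForbiddenForXhr(h.name);
    });
}
```

Under this check a Referer entry is filtered out, while a custom header such as X-Requested-With would be kept.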

I've spent some time exploring this issue but haven't found a fully satisfactory solution or workaround yet. However, I have made progress, so I'll share my findings here in case they prove helpful to someone else. (Moreover, upcoming enhancements in HTML5 specifications could potentially make this a viable workaround.)


One straightforward and quick alternative (though with a notable drawback) is creating and virtually "clicking" on a link (<a> element) that points directly to the image source URL.

Pros:

  • Simple and fast to implement.
  • Avoids issues related to headers and CORS (further details below) as it operates within the webpage context.
  • No need for the chrome.downloads API.
  • No background page required.

Cons:

  • Lack of control over where the file is saved (although you can specify the filename) and whether a file dialog appears.

If the default download location works for your needs, this approach might be suitable :)

Implementation:

var suspended = false;
window.addEventListener('click', function(evt) {
    if (suspended) {
        return;
    }

    if (evt.ctrlKey && (evt.target.nodeName === 'IMG')) {
        var a = document.createElement('a');
        a.href = evt.target.src;
        a.target = '_blank';
        a.download = a.href.substring(a.href.lastIndexOf('/') + 1);
        a.click();

        evt.preventDefault();
        evt.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(function() {
            suspended = false;
        }, 100);
    }
}, true);
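Note that taking everything after the last slash, as the a.download line above does, keeps any query string or fragment in the file name. A minimal sketch of a slightly more careful helper (`filenameFromUrl` and the 'download' fallback are hypothetical, chosen only for illustration):

```typescript
// Derives a file name from the last path segment of a URL string.
function filenameFromUrl(url: string, fallback: string = 'download'): string {
    // Drop the fragment and query string before looking for the last slash.
    const path = url.split('#')[0].split('?')[0];
    const raw = path.substring(path.lastIndexOf('/') + 1);
    if (raw === '') {
        return fallback; // URL ends in a slash, so there is no segment to use
    }
    try {
        return decodeURIComponent(raw); // turn "%20" etc. back into characters
    } catch (e) {
        return raw; // malformed escape sequence: keep the segment as-is
    }
}
```

With this helper the assignment would become `a.download = filenameFromUrl(a.href);`.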

In an attempt to overcome the limitations of the above solution, I experimented with the following process:

  1. When an image is Ctrl+clicked, the source URL is sent to a background page.
  2. The background page opens the URL in a new tab, ensuring the tab has the same origin as the images - critical later on.
  3. The background page injects code into the new tab that:
    a. Creates a <canvas> element and draws the image onto it.
    b. Converts the drawn image to a dataURL(*).
    c. Returns the dataURL to the background page for further action.
  4. The background page receives the dataURL, closes the opened tab, and triggers a download using chrome.downloads.download() along with the received dataURL as the url value.

(*): To prevent information leakage, the canvas refuses to convert an image to a dataURL unless the image has the same origin as the current web page. That is why the image's source URL has to be opened in a new tab.

Pros:

  • Enables control over displaying the file dialog.
  • Includes all conveniences offered by the chrome.downloads API (whether needed or not).
  • Almost meets expectations :/

Cons - Caveats:

  • Relatively slow due to loading the image in the new tab.
  • The maximum image size limit depends on the maximum URL length allowed. While specifics aren't readily available, estimates put acceptable image sizes at a few hundred MB, posing a limitation.
  • Mainly, canvas' toDataURL() returns data at 96dpi, which poses a challenge for high-resolution images.
    Fortunately, the sibling method toDataURLHD() provides data at the native canvas bitmap resolution.
    However, Google Chrome currently does not support toDataURLHD().
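On the URL-length caveat above: toDataURL() returns a base64 string, and base64 encodes every 3 bytes as 4 characters, so the resulting URL is roughly a third longer than the encoded image. A small sketch of the arithmetic (`dataUrlLength` is a hypothetical helper, not part of any API):

```typescript
// Length in characters of a base64 data URL holding `byteLength` bytes.
function dataUrlLength(byteLength: number, mimeType: string): number {
    const prefix = 'data:' + mimeType + ';base64,';
    // Base64 emits 4 characters per (padded) group of 3 input bytes.
    return prefix.length + 4 * Math.ceil(byteLength / 3);
}
```

Comparing this estimate against whatever URL length limit applies gives a rough idea of how large an image this approach can handle.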

More information on canvas, toDataURL(), and toDataURLHD() methods can be found here. Hopefully, future browser support will make this solution viable :)

Implementation:

An extension sample would comprise three files:

  1. manifest.json: The manifest
  2. content.js: The content script
  3. background.js: The background page

manifest.json:

{
    "manifest_version": 2,

    "name":    "Test Extension",
    "version": "0.0",
    "offline_enabled": false,

    "background": {
        "persistent": false,
        "scripts": ["background.js"]
    },

    "content_scripts": [{
        "matches":    ["*://*/*"],
        "js":         ["content.js"],
        "run_at":     "document_idle",
        "all_frames": true
    }],

    "permissions": [
        "downloads",
        "*://*/*"
    ]
}

content.js:

var suspended = false;
window.addEventListener('click', function(evt) {
    if (suspended) {
        return;
    }

    if (evt.ctrlKey && (evt.target.nodeName === 'IMG')) {

        /* Initialize the "download" process
         * for the specified image's source-URL */
        chrome.runtime.sendMessage({
            action: 'downloadImgStart',
            url:    evt.target.src
        });

        evt.preventDefault();
        evt.stopImmediatePropagation();

        suspended = true;
        window.setTimeout(function() {
            suspended = false;
        }, 100);
    }
}, true);

background.js:

/* This function, injected into the tab with the image,
 * handles putting the image into a canvas and converting it to a dataURL
 * to be sent back to the background page for processing */
var imgToDataURL = function() {
    /* Determine image details like name, type, quality */
    var src     = window.location.href;
    var name    = src.substring(src.lastIndexOf('/') + 1);
    var type    = /\.jpe?g([?#]|$)/i.test(name) ? 'image/jpeg' : 'image/png';
    var quality = 1.0;

    /* Load image onto canvas and convert to dataURL */
    var img = document.body.querySelector('img');
    var canvas = document.createElement('canvas');
    canvas.width = img.naturalWidth;
    canvas.height = img.naturalHeight;
    var ctx = canvas.getContext('2d');
    ctx.drawImage(img, 0, 0);
    var dataURL = canvas.toDataURL(type, quality);

    /* Update `name` if the specified type isn't supported */
    if ((type !== 'image/png') && (dataURL.indexOf('data:image/png') === 0)) {
        name += '.png';
    }

    /* Send dataURL and `name` back to background page */
    chrome.runtime.sendMessage({
        action: 'downloadImgEnd',
        url:    dataURL,
        name:   name
    });
}

/* Inject into webpage containing image */
var codeStr = '(' + imgToDataURL + ')();';

/* Listen for messages from content scripts */
chrome.runtime.onMessage.addListener(function(msg, sender) {

    /* Validate message contains 'URL' */
    if (!msg.url) {
        console.log('Invalid message format: ', msg);
        return;
    }

    switch (msg.action) {
    case 'downloadImgStart':
        /* Request from original page:
         * Open image's source URL in a new unfocused tab
         * (avoid CORS-related errors) and inject 'imgToDataURL' */
        chrome.tabs.create({
            url: msg.url,
            active: false
        }, function(tab) {
            chrome.tabs.executeScript(tab.id, {
                code:      codeStr,
                runAt:     'document_idle',
                allFrames: false
            });
        });
        break;
    case 'downloadImgEnd':
        /* DataURL acquired successfully!
         * Close background tab and initiate download */
        chrome.tabs.remove(sender.tab.id);
        chrome.downloads.download({
            url:      msg.url,
            filename: msg.name || '',
            saveAs:   true
        });
        break;
    default:
        /* Report invalid message 'action' */
        console.log('Invalid action: ', msg.action, ' (', msg, ')');
        break;
    }
});

Apologies for the lengthy response (which doesn't present a definitive solution).
Hopefully, there are useful insights here (or it saves others time experimenting with similar approaches).
