This question is old, but I ran into a similar problem while experimenting with a document fragment. I found that to load a string of HTML and get DOM elements back, I had to append a div to the fragment and set the div's innerHTML. If you need to handle entire documents, there are other methods, covered further down.
In Firefox (23.0.1), setting the innerHTML property of a document fragment does not appear to generate elements automatically. The elements are only created after the fragment is appended to the document itself.
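The div workaround described above might look roughly like the sketch below. The function name parseIntoFragment and its optional doc parameter are made up for illustration; in a browser you would normally just let it default to the global document.

```javascript
// Hypothetical helper: parse an HTML string without attaching anything
// to the live page, using a div inside a document fragment.
function parseIntoFragment(html, doc) {
  doc = doc || document;                  // assumption: running in a browser
  var fragment = doc.createDocumentFragment();
  var container = doc.createElement('div');
  fragment.appendChild(container);
  // The div has an innerHTML setter, so its parser builds the elements.
  container.innerHTML = html;
  return fragment;
}

// Browser usage (not run here):
//   var frag = parseIntoFragment('<p>one</p><p>two</p>');
//   var nodes = frag.firstChild.childNodes; // the parsed elements
```

The parsed nodes live under the wrapper div, so read them from frag.firstChild rather than from the fragment directly.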
For creating complete documents, use the document.implementation methods if they are supported. This has worked for me in Firefox, though I have not tested it much in other browsers. See HTMLParser.js in AtropaToolbox for an example of using the document.implementation methods. I used that script to parse pages fetched with XMLHttpRequest and extract data from them without executing the pages' scripts, which is what I needed at the time. I took this verbose route because I was getting parse errors when relying on the XMLHttpRequest object's own parsing. Declaring the document to be parsed as HTML 4 Transitional made the parser lenient enough to turn almost any content into a DOM.
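As a rough illustration of the document.implementation approach (this is not the code from HTMLParser.js, just a minimal sketch), createHTMLDocument can produce a detached document that you then fill from a string. The name parseHtmlDocument and the optional doc parameter are assumptions for the example.

```javascript
// Hypothetical helper: build a detached HTML document from a string.
// Scripts inside the string are not executed when inserted this way.
function parseHtmlDocument(html, doc) {
  doc = doc || document;                  // assumption: running in a browser
  // The argument is the new document's title; an empty string is fine.
  var parsed = doc.implementation.createHTMLDocument('');
  // Replaces the generated head and body with whatever the string contains.
  parsed.documentElement.innerHTML = html;
  return parsed;
}

// Browser usage (not run here):
//   var page = parseHtmlDocument(xhr.responseText);
//   var links = page.getElementsByTagName('a');
```

One caveat with this sketch: a doctype or outer html tag in the string is dropped, since the content is parsed as the children of the existing html element.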
Another option is DOMParser, which may be easier to use. Eli Grey's implementation on MDN covers browsers that lack DOMParser but support document.implementation.createHTMLDocument. According to the DOMParser specification, script tags in the parsed page are not executed and the contents of noscript tags are rendered.
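For comparison, a DOMParser version can be very short. The wrapper name parseWithDomParser and the optional ParserCtor parameter are only there for illustration; in a browser that supports the 'text/html' type you can call new DOMParser() directly.

```javascript
// Hypothetical wrapper around DOMParser for HTML strings.
function parseWithDomParser(html, ParserCtor) {
  // Falls back to the global DOMParser when no constructor is passed in.
  var Parser = ParserCtor || DOMParser;
  // 'text/html' asks for the HTML parser; scripts in the result do not run.
  return new Parser().parseFromString(html, 'text/html');
}

// Browser usage (not run here):
//   var page = parseWithDomParser('<!DOCTYPE html><title>t</title><p>hi</p>');
//   var firstParagraph = page.querySelector('p');
```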
If scripts must run, you can create an iframe, hide it by shrinking its dimensions and removing its borders, and load the page into it. You can also use window.open() together with document.write or DOM methods, and some browsers even accept data URIs:
var x = window.open( 'data:text/html;base64,' + btoa('<h1>hi</h1>') );
// Wait for the document to load. It only takes a few milliseconds,
// but we'll wait 5 seconds for observation purposes.
setTimeout(function () {
    console.log(x.document.documentElement.outerHTML);
    x.console.log('This is the console in the child window');
    x.document.body.innerHTML = 'Oh wow';
}, 5000);
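The hidden iframe route could be sketched along these lines. The name loadInHiddenIframe, the optional doc parameter, and the exact hiding styles are assumptions for the example, not a tested recipe. Unlike the parsing approaches, scripts written into the frame this way do run.

```javascript
// Hypothetical helper: load an HTML string into a hidden iframe so that
// the page's scripts execute in their own window.
function loadInHiddenIframe(html, doc) {
  doc = doc || document;                  // assumption: running in a browser
  var frame = doc.createElement('iframe');
  // Shrink the frame and drop its border so it does not disturb the page.
  frame.style.cssText =
    'position:absolute;width:0;height:0;border:0;visibility:hidden';
  doc.body.appendChild(frame);            // contentWindow exists once attached
  var inner = frame.contentWindow.document;
  inner.open();
  inner.write(html);
  inner.close();
  return frame;
}

// Browser usage (not run here):
//   var frame = loadInHiddenIframe('<script>document.title = "ran";<\/script>');
//   frame.contentWindow.document.title; // set by the embedded script
```

Remove the frame from the page when you are done with it.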
There are several ways to create and manipulate complete documents offscreen or hidden, and all of them support loading a document from a string.
Also consider PhantomJS, a headless, scriptable web browser based on WebKit. It has access to the local filesystem, so you should be able to do nearly anything you need with full-page scripting and manipulation.