My goal is to recursively mirror a website, that is, to retrieve every page on it. Since all the pages are located in subfolders of one main folder, I thought I could easily accomplish this using wget:
wget --mirror --recursive --page-requisites --adjust-extension --no-parent --convert-links https://www.example.com/
The problem is that each page is saved before its JavaScript has executed, and the scripts themselves do not seem to be mirrored along with the rest of the content. These scripts matter because they modify each page's Document Object Model (DOM), so I need a way to include their effect in the mirror. Alternatively, I could wait for each page to finish loading and then mirror the fully loaded result (how long this takes isn't critical).
I have tried mirroring the site with PhantomJS, but it doesn't seem to support recursion, or at least I haven't been able to figure out how to do it. I also consulted the wget manual, but couldn't find any options that fit my requirements.
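To make the kind of recursion I am after concrete, here is a minimal Python sketch of the crawl loop. It is only an illustration under assumptions, not a working mirror: `render` is a hypothetical callable that takes a URL and returns the HTML after the page's scripts have run (in practice it would have to wrap a headless browser), and `crawl` and `LinkParser` are names I made up. Links are followed only if they stay under the starting folder, which is roughly what --no-parent does in wget.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, render):
    """Breadth-first crawl of every page reachable under start_url.

    `render` is a hypothetical callable (URL -> rendered HTML string);
    it stands in for whatever executes the page's JavaScript.
    Returns a dict mapping each visited URL to its rendered HTML.
    """
    pages = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = render(url)
        pages[url] = html

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(url, href))
            # Stay inside the starting folder and visit each page once.
            if absolute.startswith(start_url) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

Calling `crawl("https://www.example.com/", render)` would then return the rendered HTML for each page, which could be written to disk. The sketch does not download page requisites or rewrite links the way --page-requisites and --convert-links do.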
Is there a way to achieve what I'm looking for? Any suggestions would be greatly appreciated. Thank you.