Unable to pass array elements to a function in PhantomJS

I am trying to extract the source code of multiple webpages at once. The links are read from a text file into an array. I can loop through the array and print the links, but when I pass them to a function they become undefined after the first iteration.

My goal is to save the source code of each webpage to its own file. The first page is saved correctly, but every subsequent attempt ends up with undefined values. I have spent hours searching without success and would greatly appreciate any pointers in the right direction.

var fs = require('fs');
var pageContent = fs.read('input.txt');
var arrdata = pageContent.split(/[\n]/);
var system = require('system');
var page = require('webpage').create();
var args = system.args;
var imagelink;
var content = " ";

function handle_page(file, imagelink){
    page.open(file,function(){
        var js = page.evaluate(function (){
            return document;
        });
        fs.write(imagelink, page.content, 'w');
        setTimeout(next_page(),500);
    });
}
function next_page(imagelink){
    var file = imagelink;
    if(!file){phantom.exit(0);}
    handle_page(file, imagelink);
}

for(var i in arrdata){
    next_page(arrdata[i]);
}

Upon reflection, I now understand that the for loop runs only once, while the other two functions initiate their own loops, which explains the issue I am facing. I am still struggling to find a solution to make it work properly.

Answer №1

PhantomJS's page.open() is asynchronous, which is why it takes a callback. It is also slow compared with the loop around it: the for loop finishes calling next_page() for every link before the first page has loaded. If multiple calls are made on the same page object, the later one overwrites the earlier one.

One effective approach is to use recursion:

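// Process the links one at a time: open link i, save it, then move on to i+1.
function handle_page(i){
    // All links processed: stop PhantomJS.
    if (arrdata.length === i) {
        phantom.exit();
        return;
    }
    var imageLink = arrdata[i];
    page.open(imageLink, function(){
        // The page has finished loading; write its source to its own file.
        fs.write("file_"+i+".html", page.content, 'w');
        // Only now start loading the next link.
        handle_page(i+1);
    });
}
handle_page(0);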
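
The same recursive pattern can be pulled out into a reusable function. This is only a sketch: processSequentially and its worker/next/done parameters are hypothetical names, not part of PhantomJS. The commented usage assumes the page, fs and arrdata variables from the question, and checks the status string that page.open() passes to its callback.

```javascript
// Run an asynchronous worker on each item in turn. The next item is
// started only after the worker signals completion by calling next().
function processSequentially(items, worker, done) {
    function step(i) {
        if (i >= items.length) {
            done();
            return;
        }
        worker(items[i], i, function () {
            step(i + 1);
        });
    }
    step(0);
}

// Hypothetical usage inside a PhantomJS script:
// processSequentially(arrdata, function (url, i, next) {
//     page.open(url, function (status) {
//         if (status === 'success') {
//             fs.write('file_' + i + '.html', page.content, 'w');
//         }
//         next(); // continue even if this page failed to load
//     });
// }, function () {
//     phantom.exit();
// });
```

Keeping the iteration logic separate from the page handling makes it easier to skip failed pages without stalling the whole run.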

A few other considerations:

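  • setTimeout(next_page(),500); invokes next_page() immediately and passes its return value (undefined) to setTimeout, whereas setTimeout(next_page, 500); schedules next_page to run after 500 ms. Either way, next_page receives no argument, so file is undefined and the script calls phantom.exit(0).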
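  • In the line fs.write(imagelink, page.content, 'w'), imagelink is the same URL that was passed to page.open(). URLs contain characters such as / and : that are not suitable for file names, so you may want to find a different way to generate a filename.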
  • While for(var i in arrdata){ next_page(arrdata[i]); } may work in this scenario, for...in iterates over all enumerable properties, including any added to Array.prototype, so it is not a reliable way to iterate arrays in general. Prefer a traditional for loop or the forEach() method.
  • page.evaluate() is sandboxed, and only JSON-serializable data can be passed out of it. Returning document, as the question's code does, will not give you a usable DOM object. Convert anything that is not serializable into a serializable form (for example a string) before returning it from evaluate().
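
For the filename point above, one possible approach is to derive the name from the URL itself. The helper below is a minimal sketch; urlToFilename is a hypothetical name, and the set of characters treated as safe is an assumption that you may want to adjust for your file system.

```javascript
// Hypothetical helper: derive a file name from a URL and its position in the list.
function urlToFilename(url, index) {
    var cleaned = url
        .replace(/^[a-z][a-z0-9+.-]*:\/\//i, '') // drop the scheme, e.g. "http://"
        .replace(/[^a-z0-9.-]+/gi, '_')          // collapse all other characters to "_"
        .replace(/^_+|_+$/g, '');                // trim underscores at both ends
    // Fall back to "page" if nothing usable is left after cleaning.
    return index + '_' + (cleaned || 'page') + '.html';
}
```

In the recursive version, the write would then become fs.write(urlToFilename(imageLink, i), page.content, 'w'). Prefixing the index keeps names unique even when two URLs clean up to the same string.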
