save the data to a CSV file once the process is complete

I've successfully coded a script that scrapes data from the first page, but I encountered an issue when implementing a loop to click on a "load more" button to fetch additional data. After running the loop, the code doesn't export anything to CSV. Is there something wrong with my exporting code? Can someone help me figure out where I'm making a mistake?

const puppeteer = require('puppeteer');
const jsonexport = require('jsonexport');

(async () => {
  const browser = await puppeteer.launch({ headless: false }); // default is true
  const page = await browser.newPage();
  await page.goto('https://www.Website.com/exercises/finder', {
    waitUntil: 'domcontentloaded',
  });

  //load more CSS to be targeted
  const LoadMoreButton =
    '#js-ex-content > #js-ex-category-body > .ExCategory-results > .ExLoadMore > .bb-flat-btn';

  do {
    // click the "load more" button, then wait 1 second
    await page.click(LoadMoreButton);
    await page.waitFor(1000);

    const loadMore = true;


    const rowsCounts = await page.$eval(
      '.ExCategory-results > .ExResult-row',
      (rows) => rows.length
    );

    //scraping the data
    const exerciseNames = [];
    for (let i = 2; i < rowsCounts + 1; i++) {
      const exerciseName = await page.$eval(
        `.ExCategory-results > .ExResult-row:nth-child(${i}) > .ExResult-cell > .ExHeading > a`,
        (el) => el.innerText
      );
      exerciseNames.push(exerciseName);
    }

    console.log({exerciseNames});
  } while (There are still exercises left);

  const allData = [
    {
      exercise: exerciseNames,
    },
  ];
  // exporting data to CSV
  const options = [exercise];
  //json export error part
  jsonexport(allData, options, function (err, csv) {
    if (err) return console.error(err);
    console.log(csv);
  });

  await browser.close();
})().catch((e) => {
  console.error(e);
});

Edit: I have made changes to the exporting and writing to a CSV file portion of the code. Currently, only the exercises are being written with 3 headers visible in the CSV. I want to format it so that each row contains data for exercise name, equipment type, and muscle target group respectively.

Current export code:

 const allData = [
    {
      exercise: exerciseNames,
      muscleGroup: muscleTargets,
      equipment: equipmentTypes,
    },
  ];

  var ws = fs.createWriteStream('test1.csv');


  csv.write(allData, { headers: true, delimiter: ',' }).pipe(ws);

  //json export error part
  jsonexport(allData, function (err, csv) {
    if (err) return console.error(err);
    console.log(csv);
  });

https://i.sstatic.net/YFD5P.png

Edit 2: This block shows my complete code. It outputs the pre-filled information from allData, but the newly scraped data is neither fetched nor exported. I need help fixing this.

const puppeteer = require('puppeteer');
const jsonexport = require('jsonexport');
const fs = require('fs');

(async () => {
  const browser = await puppeteer.launch({ headless: false }); // default is true
  const page = await browser.newPage();
  await page.goto('https://www.website.com/exercises/finder', {
    waitUntil: 'domcontentloaded',
  });

  const loadMore = true;

  const rowsCounts = await page.$$eval(
    '.ExCategory-results > .ExResult-row',
    (rows) => rows.length
  );
  let allData = [];
  for (let i = 2; i < rowsCounts + 1; i++) {
    const exerciseName = await page.$eval(
      `.ExCategory-results > .ExResult-row:nth-child(${i}) > .ExResult-cell > .ExHeading > a`,
      (el) => el.innerText
    );
    const muscleGroupName = await page.$eval(
      `.ExCategory-results > .ExResult-row:nth-child(${i}) > .ExResult-cell > .ExResult-muscleTargeted > a`,
      (el) => el.innerHTML
    );
    const equipmentName = await page.$eval(
      `.ExCategory-results > .ExResult-row:nth-child(${i}) > .ExResult-cell > .ExResult-equipmentType > a`,
      (el) => el.innerHTML
    );

    let obj = {
      exercise: exerciseName,
      muscleGroup: muscleGroupName,
      equipment: equipmentName,
    };
    allData.push(obj);
  }
  console.log(allData);

  async function fn() {
    const allData = [
      {
        exercise: 'Rickshaw Carry',
        muscleGroup: 'Forearms',
        equipment: 'Other',
      },
      {
        exercise: 'Single-Leg Press',
        muscleGroup: 'Quadriceps',
        equipment: 'Machine',
      },
      {
        exercise: 'Landmine twist',
        muscleGroup: 'Forearms',
        equipment: 'Other',
      },
      {
        exercise: 'Weighted pull-up',
        muscleGroup: 'Forearms',
        equipment: 'Other',
      },
    ];

    jsonexport(allData, function (err, csv) {
      if (err) return console.error(err);
      console.log(csv);
      fs.writeFileSync('output.csv', csv);
    });
  }
  fn();

  await browser.close();
})().catch((e) => {
  console.error(e);
});

Answer №1

I've identified two key issues that need attention.

I.) The first issue lies within the options declaration:

const options = [exercise]; // ❌

The problem here is that exercise is referenced as if it were a variable, but no such variable exists; it is a property of the object stored inside allData. To fix this, access the first element of the allData array with index [0] and then use dot notation to read its exercise property.

const options = [allData[0].exercise]; // ✅

Note: It's better to set options to just allData[0].exercise (without the enclosing array). The exercise property is already an array, so wrapping it in another array only nests the structure one level deeper and offers no benefit.
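
To make the difference concrete, here is a minimal sketch. The allData contents are placeholder values rather than scraped data, and the variable names bareLookupErrorName and exerciseList are only for illustration. It shows that a bare exercise identifier throws, while allData[0].exercise returns the array:

```javascript
// Placeholder data in the same shape as the question's allData.
const allData = [{ exercise: ['Rickshaw Carry', 'Single-Leg Press'] }];

// Reading an identifier that was never declared throws a ReferenceError.
let bareLookupErrorName = null;
try {
  const options = [exercise]; // `exercise` is not a variable in this scope
} catch (err) {
  bareLookupErrorName = err.name;
}
console.log(bareLookupErrorName); // ReferenceError

// Property access on the first array element works as intended.
const exerciseList = allData[0].exercise;
console.log(exerciseList.length); // 2
```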


II.) The second issue is how the jsonexport npm package is called. Passing allData in this line looks accidental:

jsonexport(allData, options, function (err, csv) // ❌

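In fact, only the options array needs to be passed here, because it already holds the data to export. jsonexport takes the data as its first argument, followed by the callback; a second argument placed before the callback is treated as configuration, not as more data: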

jsonexport(options, function (err, csv) // ✅
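
If the export has to finish before later steps run (for example before browser.close()), the callback form can be wrapped in a promise and awaited. The sketch below is only an illustration: exportToCsv is a hypothetical helper, and stubExporter is a stand-in with the same (data, callback) shape as the call above, used so the snippet runs without installing anything. In real code the exporter argument would be jsonexport.

```javascript
// Hypothetical helper: wrap a callback-style exporter (such as the
// jsonexport call above) in a promise so it can be awaited.
function exportToCsv(exporter, data) {
  return new Promise((resolve, reject) => {
    exporter(data, (err, csv) => (err ? reject(err) : resolve(csv)));
  });
}

// Stand-in exporter for illustration only. It is NOT jsonexport and
// performs no CSV escaping; it just joins the values with newlines.
function stubExporter(data, callback) {
  callback(null, data.join('\n'));
}

exportToCsv(stubExporter, ['Rickshaw Carry', 'Single-Leg Press']).then((csv) => {
  console.log(csv);
});
```

Inside the async function from the question, this would be used as const csv = await exportToCsv(jsonexport, options); followed by the file write.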

Edit

The problem in your updated question can be solved by restructuring allData into one object per row, so that jsonexport can interpret each column and row correctly.

const jsonexport = require('jsonexport')
const fs = require('fs')

async function fn() {
  const allData = [
    {
      exercise: 'Rickshaw Carry',
      muscleGroup: 'Forearms',
      equipment: 'Other'
    },
    {
      exercise: 'Single-Leg Press',
      muscleGroup: 'Quadriceps',
      equipment: 'Machine'
    },
    {
      exercise: 'Landmine twist',
      muscleGroup: 'Forearms',
      equipment: 'Other'
    },
    {
      exercise: 'Weighted pull-up',
      muscleGroup: 'Forearms',
      equipment: 'Other'
    }
  ]

  // convert to CSV, print it, and write it to output.csv
  jsonexport(allData, function (err, csv) {
    if (err) return console.error(err)
    console.log(csv)
    fs.writeFileSync('output.csv', csv)
  })
}
fn()
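
If the names, muscle groups, and equipment types have already been collected into three parallel arrays (as in the question's first edit), they can be combined into one object per row before exporting. A minimal sketch, assuming the three arrays have the same length; toRows is a hypothetical helper and the sample values are placeholders:

```javascript
// Hypothetical helper: turn three parallel arrays into one object per CSV row.
function toRows(exerciseNames, muscleTargets, equipmentTypes) {
  return exerciseNames.map((exercise, i) => ({
    exercise,
    muscleGroup: muscleTargets[i],
    equipment: equipmentTypes[i],
  }));
}

const rows = toRows(
  ['Rickshaw Carry', 'Single-Leg Press'],
  ['Forearms', 'Quadriceps'],
  ['Other', 'Machine']
);
console.log(rows);
```

The resulting rows array has the same shape as the hard-coded allData above, so it can be passed to jsonexport in its place.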

To build this structure from the scraped page, push one object per row onto allData in each iteration, like so:

let allData = []
for (let i = 2; i < rowsCounts; i++) {
  const exerciseName = await page.$eval(`...row:nth-child(${i})...`,
    el => el.textContent.trim())
  const muscleGroupName = await page.$eval(`...row:nth-child(${i})...`,
    el => el.textContent.trim())
  const equipmentName = await page.$eval(`...row:nth-child(${i})...`,
    el => el.textContent.trim())

  let obj = {
    exercise: exerciseName,
    muscleGroup: muscleGroupName,
    equipment: equipmentName
  }
  allData.push(obj)
}
console.log(allData)
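
For the "load more" loop from the original question, one option is to click until the button is no longer in the page and only then scrape all rows once. The sketch below rests on assumptions: clickUntilGone is a hypothetical helper, it assumes the site removes the button from the DOM when nothing is left to load (if the button is only hidden, the check needs to change), and maxClicks is an arbitrary safety limit. It relies only on Puppeteer's page.$(selector), which resolves to null when nothing matches, and page.click(selector). The pause uses a plain setTimeout promise because page.waitFor has been deprecated in newer Puppeteer releases.

```javascript
// Hypothetical helper: keep clicking a "load more" button until it is gone
// from the DOM or a safety limit is reached. Returns the number of clicks.
// `page` must provide page.$(selector) and page.click(selector), as a
// Puppeteer Page does.
async function clickUntilGone(page, selector, { delayMs = 1000, maxClicks = 50 } = {}) {
  let clicks = 0;
  while (clicks < maxClicks && (await page.$(selector)) !== null) {
    await page.click(selector);
    clicks += 1;
    // Give the page time to append the new rows before checking again.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return clicks;
}
```

In the question's script this would be called as await clickUntilGone(page, LoadMoreButton); before the scraping loop, so that allData is declared once, outside any loop, and is still in scope when it is exported.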
