Scraping a few URLs with JavaScript for Web Data Extraction

I'm struggling to retrieve data from multiple URLs and write it to a CSV file. The fetched data is incomplete (I expect 10 items) and out of order: instead of 1, 2, 3 in sequence, I get something like 6, 10, 5, 1... Sometimes I get six h3 values, sometimes five, and the count changes randomly between runs. My URLs are correct, and using async/await syntax hasn't resolved the issue. I'm a beginner; my code is below:

const request = require('request');
const cheerio = require('cheerio');
const fs = require('fs');
const writeSteam = fs.createWriteStream('data.csv');

let data= '';
const numOfFetchData = 10;
const numbers = Array.from(Array(numOfFetchData + 1).keys());

async function getData() {
    for await (const number of numbers) {
        request('randomURL/' + (number+1), (err, res, html) => {
            if(!err && res.statusCode == 200 && (number+1) <= numOfFetchData) {
                const $ = cheerio.load(html);
                const h3Tag = $("h3")[0].children[0].data;
                data += (number + 1) + ' ' + h3Tag + '\n'   
            } else {
                writeSteam.write(`${data}`); 
            }
        });
    };
};

getData();

What can I do to enhance my code?

Thank you and Regards!

Answer №1

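The request library works with callbacks rather than promises, so async/await has nothing to wait on here. The for await loop fires every request immediately, and the callbacks then run in whatever order the responses arrive. That is why the numbers come out shuffled, and why the file can be written before some responses are in. If you want your code to fetch data in sequence, you can: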

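  1. Implement recursion to trigger the next request only after the previous one is completed:
// Works through the queue front to back; each request starts only after
// the previous callback has fired.
function fetchData(queue) {
    if (queue.length === 0) {
        // Every page has been handled: write the collected rows once.
        writeSteam.write(data);
        return;
    }
    const number = queue.shift(); // next number, in ascending order
    request('randomURL/' + (number + 1), (err, res, html) => {
        if (!err && res.statusCode == 200) {
            const $ = cheerio.load(html);
            const h3Tag = $("h3")[0].children[0].data;
            data += (number + 1) + ' ' + h3Tag + '\n';
        }
        fetchData(queue); // move on to the next number
    });
}

// Pass a copy limited to numOfFetchData entries so `numbers` is not mutated.
fetchData(numbers.slice(0, numOfFetchData));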
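  2. If the order of fetching data doesn't matter as long as the results match the initial number array sequence, consider using fetch (a promise-based API, built into Node.js 18 and later or available through the node-fetch package) instead of request:
async function fetchData() {
    // All requests start at once; Promise.all keeps the results in input order.
    const fetchPromises = numbers.slice(0, numOfFetchData).map(number => fetch('randomURL/' + (number + 1)));
    const results = await Promise.all(fetchPromises); // results are ordered
    // Process the results: each entry is a Response, so read its body as text first
    const pages = await Promise.all(results.map(res => res.text()));
    pages.forEach((html, i) => { data += (i + 1) + ' ' + cheerio.load(html)('h3').first().text() + '\n'; });
    writeSteam.write(data);
}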
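
If you would rather keep a callback-style client, a third route is to wrap it in a Promise yourself, after which a plain for loop with await behaves the way you expected. The sketch below is only an illustration under stated assumptions: fakeRequest is a hypothetical stand-in for request that answers after a delay (no network involved), requestPage, firstH3, fetchSequentially and fetchConcurrently are names made up for this example, and a regex stands in for cheerio so the snippet has no dependencies.

```javascript
// Hypothetical stand-in for a callback-style HTTP client such as `request`.
// It answers after a delay, and later pages answer sooner than earlier ones,
// which mimics responses arriving out of order.
function fakeRequest(url, callback) {
    const n = Number(url.split('/').pop());
    const delayMs = (11 - n) * 5; // page 1 is slowest, page 10 is fastest
    setTimeout(() => callback(null, { statusCode: 200 }, `<h3>Title ${n}</h3>`), delayMs);
}

// Wrap the callback API in a Promise so that `await` has something to wait on.
function requestPage(url) {
    return new Promise((resolve, reject) => {
        fakeRequest(url, (err, res, html) => {
            if (err || res.statusCode !== 200) {
                reject(err || new Error('Unexpected status ' + res.statusCode));
            } else {
                resolve(html);
            }
        });
    });
}

// Extract the text of the first <h3>. A regex keeps this sketch dependency-free;
// the code in the question uses cheerio for this step.
function firstH3(html) {
    const match = /<h3[^>]*>(.*?)<\/h3>/.exec(html);
    return match ? match[1] : '';
}

// Sequential: each request starts only after the previous one has resolved.
async function fetchSequentially(count) {
    const rows = [];
    for (let number = 1; number <= count; number++) {
        const html = await requestPage('randomURL/' + number);
        rows.push(number + ' ' + firstH3(html));
    }
    return rows;
}

// Concurrent: all requests start together; Promise.all preserves input order
// even though the responses arrive in a different order.
async function fetchConcurrently(count) {
    const urls = Array.from({ length: count }, (_, i) => 'randomURL/' + (i + 1));
    const pages = await Promise.all(urls.map(url => requestPage(url)));
    return pages.map((html, i) => (i + 1) + ' ' + firstH3(html));
}
```

Both functions resolve to an array of rows in ascending order, so the file can be written once with something like writeSteam.write(rows.join('\n')) after the await. The sequential version is gentler on the server, while the concurrent one usually finishes sooner.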
