Extracting a collection of articles with cheerio

I am trying to parse out the date and full URL from articles.

const cheerio = require('cheerio');
const request = require('request');
const resolveRelative = require('resolve-relative-url');

request('https://www.moneyweb.co.za/', function (error, response, html) {
    if (!error && response.statusCode == 200) {
        const $ = cheerio.load(html);
        $('.border0010-dotted').each(function (i, element) {
            const title = $(this).find('.title').text().trim()
            const url = resolveRelative($(this).find('.a href').text().trim(), response.request.uri.href)
            const date = $(this).attr('.inline-block')
            const description = $(this).find('.excerpt').text().trim()
            const feedItem = {
                title: title,
                description: description,
                url: url,
                date: date
            }
            console.log(feedItem)
        })
    }
});

This is an example output:

{ title: 'Hiring a new bank CEO rarely improves the share price',
  description: 'New CEOs have done little to boost Europe bank stocks.',
  url: 'https://www.moneyweb.co.za/',
  date: undefined }

Any suggestions on how I can retrieve the date and full URL?

Answer №1

Here are a few things to consider:

  • const date = $(this).attr('.inline-block')
    reads an attribute literally named ".inline-block", which does not exist, so it returns undefined. Use find() with the class selector instead. The date is not in the same position in every container, but
    $(this).find(".meta .inline-block:last").prev().text();
    seems to work for all of them.
  • The .border0010-dotted class also appears lower down the page on items that carry less data, so it's worth narrowing the selector to the main content with
    #home-panel-loop .border0010-dotted
  • You can get the URL from the .title a element by reading its href attribute. The selector .a href does not do that: it looks for an <href> tag inside an element with class a, which matches nothing, so text() returns an empty string.

Putting it together:

const cheerio = require("cheerio"); // 1.0.0-rc.12
const request = require("request");

request("https://www.moneyweb.co.za/", function (error, response, html) {
  if (!error && response.statusCode === 200) {
    const $ = cheerio.load(html);
    $("#home-panel-loop .border0010-dotted").each(function () {
      const title = $(this).find(".title").text().trim();
      const url = $(this).find(".title a").attr("href");
      const date = $(this)
        .find(".meta .inline-block:last")
        .prev()
        .text()
        .trim();
      const description = $(this).find(".excerpt").text().trim();
      const feedItem = {title, description, url, date};
      console.log(feedItem);
    });
  }
});
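
The href read in the code above may be relative on some pages. As a minimal sketch, assuming Node's built-in URL class is an acceptable stand-in for resolve-relative-url, it could be resolved against the page address like this. toAbsoluteUrl is a hypothetical helper name, and the example paths are made up for illustration.

```javascript
// Hypothetical helper (not part of cheerio or request): resolve a scraped
// href against the URL of the page it came from.
function toAbsoluteUrl(href, baseUrl) {
  // attr("href") returns undefined when the selector matched nothing.
  if (!href) return undefined;
  try {
    // The WHATWG URL constructor resolves relative references against the
    // base and leaves already-absolute URLs unchanged.
    return new URL(href, baseUrl).href;
  } catch {
    // The constructor throws on input it cannot parse.
    return undefined;
  }
}

const base = "https://www.moneyweb.co.za/";

// Made-up example paths, for illustration only.
console.log(toAbsoluteUrl("/news/example-article/", base));
console.log(toAbsoluteUrl("https://example.com/a", base));
console.log(toAbsoluteUrl(undefined, base));
```

Inside the each() callback this would be used as toAbsoluteUrl($(this).find(".title a").attr("href"), response.request.uri.href), reusing the base URL from the question's code.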
