Encountered issues loading JavaScript and received a pyppeteer error while trying to access a website through requests

I am facing a challenge when trying to scrape a webpage post login using BeautifulSoup and requests.

Initially, I encountered a roadblock where the page requested JavaScript to be enabled to continue using the application.

To work around this issue, I tried requests_html with the code snippet below:

from requests_html import HTMLSession

session = HTMLSession()

session.get(url)
resp = session.post(loginUrl, data={"email": "email@gmail.com", "password": "Pass123"})

resp.html.render()

Despite this, I continued to face the same error or encountered:

pyppeteer.errors.PageError: net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH

As a result, I switched to selenium, even though I would prefer requests because it makes the script faster.

Although the selenium approach worked well, upon loading the selenium page source into BeautifulSoup, I encountered the

Please enable JavaScript to continue using this application.

error page once again.

This has left me puzzled as the driver loads successfully and I simply parse the HTML page from selenium.

Any suggestions on how to resolve both the requests_html and BeautifulSoup issues?

Answer №1

If you want to access data without the need for pyppeteer or selenium, you can simply log in using basic requests.

The crucial step is to retrieve the accessToken from the Login endpoint and then apply it to subsequent requests.
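A minimal sketch of that step, assuming the login response is JSON shaped like {"data": {"accessToken": ...}} (the shape the full script further down relies on). The helper names extract_token and auth_headers are hypothetical and only serve to illustrate the idea:

```python
def extract_token(login_body: dict) -> str:
    """Pull the access token out of the parsed login response.

    Assumes the body looks like {"data": {"accessToken": "..."}}.
    """
    try:
        return login_body["data"]["accessToken"]
    except (KeyError, TypeError) as exc:
        # A failed login will most likely not contain the token at all.
        raise ValueError("login response has no data.accessToken") from exc


def auth_headers(token: str) -> dict:
    """Build the header to send with every request after login."""
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # Hypothetical response body, for illustration only.
    body = {"data": {"accessToken": "abc123"}}
    print(auth_headers(extract_token(body)))
```

Raising a ValueError with a clear message makes a bad login easier to spot than a bare KeyError deep inside the script.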

The API calls I'm utilizing here provide the essential information on the page post-login. The rest of the HTML serves mainly as visual decoration. The data obtained from the API mirrors what is visible on the website:

https://i.sstatic.net/pI054.png

As for the pyppeteer.errors.PageError: net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH error, it typically means the SSL/TLS handshake failed: the browser and the server could not agree on a protocol version or cipher suite, usually because the server offers only an outdated or unsupported one.
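If you want to see what a handshake with the server looks like from Python, something like the following may help. It is only a diagnostic sketch: probe_tls and client_tls_range are hypothetical helpers, and Python's ssl module is not the TLS stack that the Chromium build behind pyppeteer uses, so the result may differ from what the browser negotiates.

```python
import socket
import ssl


def client_tls_range() -> tuple:
    """Return the names of the lowest and highest TLS versions
    that Python's default client context is willing to use."""
    context = ssl.create_default_context()
    return context.minimum_version.name, context.maximum_version.name


def probe_tls(host: str, port: int = 443, timeout: float = 10.0) -> tuple:
    """Connect to host and return (protocol version, cipher name)
    as negotiated by Python's default TLS settings."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]


if __name__ == "__main__":
    print("client range:", client_tls_range())
    # Needs network access; the host is the API host used in this answer.
    print("negotiated:", probe_tls("api-it.saywow.me"))
```

If probe_tls raises ssl.SSLError, Python could not complete the handshake with its default settings either.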

For more insights on this error, you can refer to this link.

TL;DR: Unfortunately, there isn't much you can do about it.

I suggest adopting my method (relying on API calls without the need for a browser).

The advantages of this approach include:

  • lightweight
  • relatively fast
  • no SSL errors
  • full data access

Here's the procedure to retrieve sales data:

import requests
from dateutil.parser import parse

login_url = "https://api-it.saywow.me/it-it/api/Users/Login"
sales_url = "https://api-it.saywow.me/it-it/api/Booking/GetCanBookSaleEvents"
payload = {
    "email": "YOUR_EMAIL",
    "password": "YOUR_PASSWORD",
}

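# Define the functions for formatting and displaying sales data here
# (main() below calls show_sales).

def main() -> None:
    with requests.Session() as session:
        # Log in and read the access token from the JSON response.
        response = session.post(login_url, json=payload)
        response.raise_for_status()
        token = response.json()["data"]["accessToken"]
        # Send the token as a Bearer header on the API call.
        sales = session.post(
            sales_url,
            headers={"Authorization": f"Bearer {token}"},
        )
        sales.raise_for_status()
        show_sales(sales.json()["data"])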

# Execute the main function

if __name__ == "__main__":
    main()

Upon entering your email and a valid password, the output should resemble this:

Event: HOUSE OF LUXURY
Address: Viale John Fitzgerald Kennedy 54, Napoli NA
Dates: 08 December - 17 December
Booked: You can book this event!


Event: Monot Archive Sale
Address: Via Orobia 11, Milano MI
Dates: 28 November - 06 December
Booked: You can book this event!

The sales data returned by the API contains additional details such as location, phone numbers, etc.

For example:

...

"addressName": "Via Orobia",
"addressNumber": "11",
"addressCity": "Milano",
"addressProvince": "MI",
"addressZip": "20139",
"addressCountry": "IT",
"addressLat": 45.4426322,
"addressLon": 9.2056631,

...
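As an illustration, the address fields above can be combined into the "Address:" line from the sample output. This is a minimal sketch: format_address is a hypothetical helper, not one of the functions from the script, and it assumes each event is a dict with the four address keys shown above.

```python
def format_address(event: dict) -> str:
    """Join the address fields into 'Street Number, City Province'.

    Missing fields are skipped instead of raising a KeyError.
    """
    street = f'{event.get("addressName", "")} {event.get("addressNumber", "")}'.strip()
    city = f'{event.get("addressCity", "")} {event.get("addressProvince", "")}'.strip()
    return ", ".join(part for part in (street, city) if part)


if __name__ == "__main__":
    # Field values copied from the example above.
    sample = {
        "addressName": "Via Orobia",
        "addressNumber": "11",
        "addressCity": "Milano",
        "addressProvince": "MI",
    }
    print("Address:", format_address(sample))
```

Using .get with a default keeps the helper from failing on events where a field is absent.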
