I am trying to scrape a web page that sits behind a login, using BeautifulSoup and requests.
At first, the response I got back was a page asking me to enable JavaScript to continue using the application.
To work around this, I switched to requests_html with the snippet below:
from requests_html import HTMLSession
session = HTMLSession()
session.get(url)
# Keep the login response so there is something to render
resp = session.post(loginUrl, data={"email": "email@gmail.com", "password": "Pass123"})
resp.html.render()
Despite this, I either get the same JavaScript message or the following error:
pyppeteer.errors.PageError: net::ERR_SSL_VERSION_OR_CIPHER_MISMATCH
As a result, I switched to selenium, even though I would prefer requests because the script runs faster.
Although the selenium approach worked well, upon loading the selenium page source into BeautifulSoup, I encountered the
Please enable JavaScript to continue using this application.
error page once again.
This has left me puzzled as the driver loads successfully and I simply parse the HTML page from selenium.
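To narrow this down, it may help to check where the message actually sits in the page source. The following is a minimal sketch using only the standard library. It assumes, without confirmation for this site, that the notice could come from a <noscript> fallback, which stays in the HTML even when JavaScript runs and which a parser still returns as text. The names `NoscriptSplitter` and `locate_notice` are hypothetical, and `page_source` stands for the string taken from selenium's `driver.page_source`.

```python
from html.parser import HTMLParser

NOTICE = "Please enable JavaScript to continue using this application."


class NoscriptSplitter(HTMLParser):
    """Collect text found inside <noscript> separately from all other text."""

    def __init__(self):
        super().__init__()
        self._depth = 0  # greater than 0 while inside a <noscript> element
        self.noscript_text = []
        self.other_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "noscript" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        target = self.noscript_text if self._depth else self.other_text
        target.append(data)


def locate_notice(page_source, notice=NOTICE):
    """Return 'noscript', 'body', or 'absent' depending on where the notice is."""
    parser = NoscriptSplitter()
    parser.feed(page_source)
    parser.close()
    if notice in "".join(parser.noscript_text):
        return "noscript"
    if notice in "".join(parser.other_text):
        return "body"
    return "absent"
```

If this returns "noscript" for the selenium page source, the rendered content may well be present elsewhere in the same HTML, and the notice is just the fallback text. If it returns "body", the page probably had not finished rendering when the source was read.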
Any suggestions on how to resolve both the requests_html and BeautifulSoup issues?