Downloading files from Oracle's website on a headless machine has proven challenging. Although I have a free account, the site uses a series of JavaScript forms and redirects that complicate the process. In Firefox, I can extract the download URL with the element inspector, convert the request into a cURL command, and run that command on a headless machine. However, every attempt to perform the whole download directly on the headless machine has failed so far.
I have successfully logged in with the following script:
#!/usr/bin/env python3
import requests  # imported but not used yet
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

username = "<my username>"
password = "<my password>"

# Copy the defaults and pass them explicitly instead of mutating the shared dict
caps = DesiredCapabilities.PHANTOMJS.copy()
caps["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0"
driver = webdriver.PhantomJS("/usr/local/bin/phantomjs", desired_capabilities=caps)
driver.set_window_size(1120, 550)
driver.get("http://www.oracle.com/technetwork/server-storage/developerstudio/downloads/index.html")
print("loaded")
driver.find_element_by_name("agreement").click()
print("clicked agreement")
driver.find_element_by_partial_link_text("RPM installer").click()
print("clicked link")
driver.find_element_by_id("sso_username").send_keys(username)
driver.find_element_by_id("ssopassword").send_keys(password)
driver.find_element_by_xpath("//input[contains(@title,'Please click here to sign in')]").click()
print("submitted")
print(driver.get_cookies())
print(driver.current_url)
print(driver.page_source)
driver.quit()
The login appears to succeed, since the cookies contain data related to my username. In Firefox, submitting the form triggers the download after several redirects. Here, however, page_source and current_url still reflect the login page, and no download starts.
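One thing that may be worth ruling out is timing: click() can return before the JavaScript redirects have run, so values printed immediately afterwards could simply be stale. Below is a minimal sketch of a polling helper for checking this. The name wait_until is hypothetical and not part of Selenium; Selenium's own WebDriverWait serves the same purpose.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns the truthy value, or None if the timeout is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Hypothetical usage with the script above:
#   login_url = driver.current_url   # capture just before clicking sign-in
#   ... click the sign-in button ...
#   moved = wait_until(lambda: driver.current_url != login_url, timeout=60)
#   print("redirected" if moved else "still on the login page")
```

If the URL never changes even with a generous timeout, timing is probably not the cause and the redirects themselves are not being executed.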
The website may be actively blocking this, or I may be missing something crucial. Any suggestions on how to proceed with downloading the file?
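If a direct download URL can be obtained, the browser-to-cURL handoff described at the top could in principle be reproduced in Python by replaying the session cookies. The following is an untested sketch using only the standard library. The names cookie_header, build_request, download, and download_url are hypothetical, and the code assumes cookies in the list-of-dicts form returned by driver.get_cookies().

```python
import shutil
import urllib.request

USER_AGENT = ("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) "
              "Gecko/20100101 Firefox/55.0")


def cookie_header(cookies):
    """Build a Cookie header value from Selenium-style cookie dicts."""
    return "; ".join("{}={}".format(c["name"], c["value"]) for c in cookies)


def build_request(url, cookies, user_agent=USER_AGENT):
    """Create a GET request that carries the browser session's cookies."""
    headers = {"Cookie": cookie_header(cookies), "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def download(url, cookies, dest):
    """Stream the response body to dest; urllib follows HTTP redirects."""
    request = build_request(url, cookies)
    with urllib.request.urlopen(request) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest


# Hypothetical usage, where download_url is whatever URL the browser ends up at:
#   download(download_url, driver.get_cookies(), "installer.rpm")
```

This sends every cookie regardless of its domain and does not store cookies set by intermediate redirects, so it may fail if the site relies on either.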