Extract data from dynamically loaded tables using PhantomJS and Selenium Webdriver

Everyone tells me that PhantomJS should be able to retrieve table content that is loaded by JavaScript, but I have not managed to make it work.

I expected to get the table from the website.

Page 1 is returned correctly.

But when I try to move to page 2 by clicking the pagination element located with a CSS selector, I still get the content of page 1. What could be causing this?

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
driver.find_element_by_css_selector("#PageCont > span.at").click()

list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
print(list_cates)

Answer №1

The issue is that the table is not updated immediately after the click. You need to wait long enough for the Ajax call to complete before reading the table. Note that the snippet below also clicks `#PageCont .next` instead of `#PageCont > span.at`.

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
driver.find_element_by_css_selector("#PageCont .next").click()

time.sleep(5)
list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
print(list_cates)
# Prints '太平鸟'

This snippet fetches the data from page 2.

https://i.sstatic.net/RT32I.png
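
A fixed `time.sleep(5)` works, but it is either longer than needed or too short on a slow connection. As a minimal sketch of an alternative, assuming Python 3, the helper below polls until the value it reads differs from the one seen before the click. `wait_for_change` and its parameters are hypothetical names chosen for illustration; they are not part of Selenium.

```python
import time


def wait_for_change(read_value, old_value, timeout=10.0, poll=0.5):
    """Poll read_value() until it returns something other than old_value.

    Returns the new value, or raises TimeoutError if nothing changed in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = read_value()
        if value != old_value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("value did not change within %.1f s" % timeout)
        time.sleep(poll)  # short pause between checks instead of one long sleep
```

With the driver from the answer above, `read_value` could be a function that returns `driver.find_element_by_css_selector(...).text` for the first-row cell: read it once before clicking `#PageCont .next`, then pass that text as `old_value`. Selenium's own `WebDriverWait(driver, 10).until(...)`, which the scripts already import, provides the same kind of polling.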

Answer №2

Hey there, check out this code snippet by Tarun Lalwani:

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
for page_count in range(1,4):
    driver.find_element_by_id("gopage").send_keys(page_count)
    driver.find_element_by_css_selector("#PageCont > a.btn_link").click()
    time.sleep(10)
    list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
    print('retrieved ' + str(page_count) + ' items')
    print(list_cates)


Moreover, trying to fetch a table within a frame using PhantomJS resulted in an error as well.

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://ipo.csrc.gov.cn/checkClick.action?choice=info#")
driver.find_element_by_css_selector("#type1 > a").click()

time.sleep(5)
result = driver.find_element_by_css_selector("#frame_body > table > tbody > tr:nth-child(1) > td > table > tbody > tr:nth-child(3) > td:nth-child(1)").text
print(result)

Answer №3

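I believe I have solved case 1 by calling `driver.find_element_by_id("gopage").clear()` before typing the page number. `send_keys` appends to whatever is already in the input, so without clearing it first the page numbers pile up in the box.

However, I still need help with the other case. Thank you in advance!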

#encoding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://data.eastmoney.com/xg/xg/default.html")
time.sleep(2)
for page_count in range(1,4):
    driver.find_element_by_id("gopage").clear()
    driver.find_element_by_id("gopage").send_keys(page_count)
    driver.find_element_by_css_selector("#PageCont > a.btn_link").click()
    time.sleep(5)
    list_cates = driver.find_element_by_css_selector("#dt_1 > tbody > tr:nth-child(1) > td:nth-child(2) > a").text
    print('obtained ' + str(page_count) + ' pieces')
    print(list_cates)
    driver.find_element_by_id("gopage").clear()
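
The clear-then-type sequence can be wrapped in a small helper so that it cannot be forgotten. This is only a sketch: `go_to_page` is a hypothetical name, and it assumes the same element locators and the same `find_element_by_*` methods used in the snippets above.

```python
def go_to_page(driver, page_number):
    """Jump to a page through the "gopage" input box and its button."""
    box = driver.find_element_by_id("gopage")
    box.clear()                      # remove what the previous jump left in the box
    box.send_keys(str(page_number))  # send_keys appends, hence the clear() first
    driver.find_element_by_css_selector("#PageCont > a.btn_link").click()
```

Inside the loop this would replace the three lines that clear the box, type the number and click the button; the wait and the table lookup stay as they are.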

Answer №4

Adding `driver.switch_to_frame("myframe")` solved case 2.

The table lives inside a frame, so the driver has to switch into that frame before it can find elements there.

Thank you very much for helping me with this issue. This is my first time asking for help on Stack Overflow, and I am loving it already!

#coding:utf-8
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.PhantomJS()
driver.get("http://ipo.csrc.gov.cn/checkClick.action?choice=info#")

time.sleep(2)
driver.switch_to_frame("myframe")
result = driver.find_element_by_css_selector("#frame_body > table > tbody > tr:nth-child(1) > td > table > tbody > tr:nth-child(3) > td:nth-child(1)").text
print(result)
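
After switching into a frame, the driver keeps looking inside it until it is told to switch back. As a hedged sketch, a context manager can make that explicit. `inside_frame` is a hypothetical helper name; it uses `driver.switch_to.frame(...)` and `driver.switch_to.default_content()`, the newer spelling of `switch_to_frame` in Selenium.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_name):
    """Switch into a frame for the duration of a with-block, then switch back."""
    driver.switch_to.frame(frame_name)
    try:
        yield driver
    finally:
        # Always return to the top-level document, even if a lookup failed.
        driver.switch_to.default_content()
```

With this, the lookup for case 2 would sit inside `with inside_frame(driver, "myframe"):`, and any code after the block works on the main page again.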
