I need to extract data from a webpage for scientific research. The text I want is inside a <span> tag, but traditional HTML parsing methods won't work because the content updates rapidly and constantly, sometimes up to 10 times per second. I know it is possible, though, based on a scientific paper I came across.
The webpage I need to gather this data from is: . Each time a paper is downloaded, a marker appears on the map indicating its location. My goal is to collect, in real time, the city/location associated with each marker as it appears; this text is displayed beneath the map on the left side.
Questions:
1) How can I parse this constantly changing text in real time, given that it is generated dynamically by JavaScript code? I have some experience with webpage parsing, but handling fast live text updates is new territory for me.
2) Given that speed matters for both parsing and writing this data, which programming language would be best suited for the project? I intend to store the extracted data in an SQL database, so efficiency is key. I would prefer to use Python, provided robust libraries are available for this purpose.
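To make the storage side of question 2 concrete, here is a rough sketch of what I have in mind, not a working scraper. It assumes the current text of the <span> can already be read somehow; `read_text` is a hypothetical placeholder for that part, which is exactly what question 1 is asking about. The table name `downloads` and the function names are also my own inventions, and SQLite stands in for whichever SQL database I end up using.

```python
import sqlite3
import time
from typing import Callable, Iterable, Iterator


def poll(read_text: Callable[[], str], interval: float = 0.05,
         max_reads: int = 100) -> Iterator[str]:
    """Yield the current text repeatedly (read_text is a hypothetical reader)."""
    for _ in range(max_reads):
        yield read_text()
        time.sleep(interval)


def record_changes(samples: Iterable[str], conn: sqlite3.Connection) -> int:
    """Store each sample that differs from the previous one; return rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS downloads (seen_at REAL, location TEXT)"
    )
    previous = None
    written = 0
    for text in samples:
        text = text.strip()
        # Skip empty reads and repeats of the value we just stored.
        if not text or text == previous:
            continue
        conn.execute(
            "INSERT INTO downloads VALUES (?, ?)", (time.time(), text)
        )
        previous = text
        written += 1
    conn.commit()
    return written
```

I realize this approach has two weaknesses: polling every 50 ms could still miss updates that arrive faster, and two consecutive downloads from the same city would be collapsed into one row. Those are part of why I am asking whether there is a better way to capture each update as it happens.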
Thank you in advance for any guidance or recommendations you may have.