Scraping a JavaScript URL from a script tag returns an empty string

I need help accessing and extracting data from a URL that is embedded within a specific tag. The tag in question looks like this:

<script src="http://includes.mpt-static.com/data/7CE5047496" type="text/javascript" charset="utf-8"></script>

So far, I have attempted to use Selenium to open the URL, but it just returns an empty string. It seems that when I manually click on the source URL, a page opens displaying a table of the desired data. However, pasting the URL directly into a browser results in an empty response. Additionally, each time I refresh the page, a new source URL is generated. Can someone explain why this behavior is occurring?

The URL in question is: view-source:

Below is the relevant portion of my code:

import time
from fake_useragent import UserAgent
import urllib2
import requests
import csv
from bs4 import BeautifulSoup
import json
from selenium import webdriver

#FAKE-USER_AGENT
ua = UserAgent(cache = False)
headers = {'User-Agent': ua.random}


#SENDING REQUEST TO PRICETRACKER WEBSITE
product = 'B00N2BW2PK'
page = requests.get('http://www.mypricetrack.com/amazon/'+str(product), headers = headers)
soup = BeautifulSoup(page.text)
#print(soup.prettify())

#GETTING URL FOR DATA
data_link = []
for tag in soup.findAll('script',{'charset':'utf-8'}):
    data_link = data_link + [tag['src']]
string2 = data_link[1]
print string2
#OPENING URL FOR DATA

driver = webdriver.Firefox()
driver.get(string2)
time.sleep(5)
htmlSource = driver.page_source
print htmlSource

Answer №1

To download the JavaScript file, your request must include the correct "Referer" header, which here is the URL of the product page.

Instead of using Selenium, a more lightweight option is to fetch it using Python requests:

import re

import requests
from bs4 import BeautifulSoup

# Use a session so headers and cookies persist across requests
session = requests.Session()
# Set browser-like headers
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1664.3 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.8,es;q=0.6'
})
# Visit the product page and parse it
product_page = 'http://mypricetrack.com/amazon/B00N2BW2PK'
res = session.get(product_page)
soup = BeautifulSoup(res.text, 'html.parser')
# Find the script tag whose src points at the data host
link = soup.find('script', {'src': re.compile('http://includes.mpt-static.com/data')})
link_src = link['src']
# Get the JavaScript content, sending the product page as the Referer
js_source = session.get(link_src, headers={'Referer': product_page}).text
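The same two steps, finding the script URL and requesting it with a Referer, can also be sketched with only the standard library. This is a minimal illustration rather than a tested scraper for this site. The names ScriptSrcFinder, find_data_script, build_script_request and fetch_script are hypothetical, the shortened User-Agent string is an assumption, and unlike the session above it does not carry cookies. Because the question notes that every page load produces a new script URL, the URL should be taken from the page fetched immediately beforehand.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Prefix taken from the script tag shown in the question
DATA_PREFIX = "http://includes.mpt-static.com/data"


class ScriptSrcFinder(HTMLParser):
    """Collect the src of every <script> tag whose URL starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src and src.startswith(self.prefix):
            self.sources.append(src)


def find_data_script(html, prefix=DATA_PREFIX):
    """Return the first matching script URL, or None if there is none."""
    finder = ScriptSrcFinder(prefix)
    finder.feed(html)
    return finder.sources[0] if finder.sources else None


def build_script_request(script_url, product_page):
    """Build (but do not send) a request that names the product page as Referer."""
    return Request(script_url, headers={
        "User-Agent": "Mozilla/5.0",  # assumed placeholder, not the answer's full string
        "Referer": product_page,
    })


def fetch_script(script_url, product_page, timeout=10):
    """Send the request and return the response body as text (network call)."""
    request = build_script_request(script_url, product_page)
    with urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Keeping build_script_request separate from fetch_script makes it possible to check the headers without touching the network. Only fetch_script performs a request.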
